Abstract


Among many adaptive algorithms that exist in the open literature, the class of approaches 
which are derived from the minimization of the mean squared error between the output of 
the adaptive filter and some desired signal, seems to be the most popular. Probably the 
simplest algorithm belonging to this class is the Least Mean Squared (LMS) algorithm 
which has the advantage of low complexity and simplicity of implementation. One of 
the main concerns in all practical situations is to develop algorithms which provide fast 
convergence of the adaptive filter coefficients and at the same time good filtering
performance. There are four main classes of applications where adaptive filters have been applied
with success, namely: system identification, inverse modeling, prediction and interference 
canceling. In this thesis we develop new algorithms for the first two classes of applications 
although they can be implemented also for prediction and interference canceling. 

In this thesis several new algorithms for adaptive filtering are introduced. The main 
goal is to improve the performance of the existing algorithms, in terms of convergence
speed and filtering performance and also to introduce some new approaches. The new 
algorithms are classified into several classes each of them addressing a certain application. 

It is well known that the LMS algorithm has a slow convergence for correlated inputs. 
Moreover its filtering performance and convergence speed are inversely related through 
a single parameter, the step-size. An adaptive filter implementing the LMS might have 
stability problems operating in a non-Gaussian environment due to the use of instanta- 
neous gradient to update the coefficients. In applications belonging to the class of system 
identification, not only the values of the coefficients of the model are of interest, but 
also the length of the model. Therefore algorithms for length adaptation might be of 
equal interest. Another situation, that can appear in identification applications is when 
the coefficients of the model are time-varying. The adaptive algorithm should provide a 
mechanism to track the changes of the model. 

This thesis contains three main parts which are concentrated on time domain im- 
plementations, transform domain implementations and applications respectively. At the 
beginning of the first part two new variable step-size LMS algorithms are introduced which 
show good convergence speed. The dependence between the speed of adaptation and filtering
performance is reduced and the setup of the parameters is very easy as compared
with other existing approaches. The problem of length estimation is addressed later on 
and an algorithm to iteratively adjust the length of the adaptive filter toward the length 
of the model is proposed. This algorithm is derived for system identification application. 
Next, the problem of tracking time-varying systems is discussed and the analytical ex- 
pressions for the steady-state mean squared error and mean squared coefficient error are 
revised. Based on these expressions a new algorithm in which the step-size is iteratively 
modified toward the optimum is introduced. An important feature of the proposed algo- 
rithm is the fact that the user does not need to know any information about the statistics 
of the optimum model. 

At the end of the first part the class of order statistics LMS algorithms is discussed 
and a new algorithm belonging to this class is introduced. The new algorithm uses an 
adaptive filter to smooth the gradient such that it does not require the knowledge of the 
noise distribution in order to be implemented. 

The second part of the thesis is dedicated to the transform domain implementation of 
the LMS algorithm and its variants. First, three new algorithms belonging to the class 
of variable step-size LMS algorithms in the transform domain are introduced. To the best of
our knowledge, the idea of step-size adaptation in the transform domain based on the output
error has not been addressed so far in the open literature. The existing approaches assume a
time-varying step-size due to the power estimates of the transform coefficients, whereas in
our implementations, the step-size is adapted by the output error. We continue with the 
problem of time-varying modeling using the transform domain LMS and we introduce a 
new algorithm. The aim of this algorithm is to increase the convergence speed of its time 
domain counterpart and also to reduce its complexity. 

At the end of the second part the scrambled LMS is briefly presented and it is compared 
with the LMS and transform domain LMS. The chosen framework is the digital data 
transmission over a telephone line. The analytical expressions of the mean squared error 
and mean squared coefficient error are derived for this special case and a discussion about
their convergence speed and steady-state error is given. The aim of this discussion is to
provide some useful information about the utility of each algorithm. 

In the first two parts of the thesis, computer experiments, showing the performances 
of all the proposed algorithms, are provided for system identification application. Since 
many of the algorithms can be implemented also in other applications, in the third part 
of this thesis, channel equalization, CDMA multiuser detection and echo cancellation 
applications are also addressed. 
Notations


Chapter 1 
Introduction 


During the last decades adaptive filters have attracted the attention of many researchers
due to their property of self-designing [41]. In applications where some a priori
information about the statistics of the data is available, a linear filter optimal for that
application can be designed in advance (e.g. the Wiener filter which minimizes the mean
squared error between the output of the filter and some desired signal). In the absence of
this a priori information a solution is to use adaptive filters, which possess the ability
to adapt their coefficients to the statistics of the signals involved. As a consequence,
adaptive filters and algorithms have been successfully implemented in a wide variety of
devices for diverse application fields such as communications, control, radar and biomedical
engineering, to mention a few.


Adaptive filtering comprises two basic operations: the filtering process and the adap- 
tation process. In the filtering process an output signal is generated from an input data 
signal using a digital filter, whereas the adaptation process consists of an algorithm which 
adjusts the coefficients of the filter to minimize a desired cost function. There is a large 
variety of filter structures and algorithms used in adaptive filtering, each of them being 
more suitable for a certain application. We first classify the adaptive filters into two 
main categories: the Adaptive Finite Impulse Response (AFIR) and the Adaptive Infinite 
Impulse Response (AIIR) filters. Moreover, in the class of AFIR filters there are three 
different filter structures, namely: the transversal filter depicted in Fig. 1.1, the lattice 
predictor as shown in Fig. 1.2 and the systolic array (see Fig. 1.3) [41]. There are other 
FIR structures such as, subband FIR adaptive filters and frequency-domain adaptive fil- 
ters, to mention a few. The first part of this dissertation addresses the algorithms for 
transversal adaptive FIR filters as depicted in Fig. 1.4, where $h_1(n), \ldots, h_N(n)$ are the
coefficients of the adaptive filter at time instant $n$, $x(n)$ is the input sequence, $y(n)$ is
the output sequence, $d(n)$ is the desired signal and $e(n)$ represents the output error. The
second part of the dissertation is dedicated to the transform domain adaptive filters, which
can be directly obtained from the structure shown in Fig. 1.4 by including an orthogonal
transform at the input of the adaptive filter.

Figure 1.2: The block diagram of a lattice predictor.

Figure 1.3: The block diagram of a rectangular systolic array.

Figure 1.4: The block diagram of an adaptive transversal FIR filter.

In connection to Fig. 1.4, the coefficients $\mathbf{h}(n)$ are changed at each iteration by means
of an adaptive algorithm. Among many adaptive algorithms, probably the best known
is the Least Mean Squared (LMS), which was derived from the minimization of the mean
squared error [25], [41], [43], [78], [83]:

$$J(n) = E\{e^2(n)\} \qquad (1.1)$$


Many other adaptive algorithms based on the minimization of other cost functions also exist,
such as the Least Mean Fourth algorithm [76], the Least Mean P-Power algorithm [62] and
algorithms with adaptive cost function [71], [72], to mention just a few. This dissertation
addresses the class of adaptive algorithms derived from the minimization of the mean
squared error, which includes the LMS algorithm and its variants. When the LMS is
used to adapt the filter coefficients, the following update equation is implemented at each
iteration:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu e(n)\mathbf{x}(n), \qquad (1.2)$$

where $\mathbf{h}(n) = [h_1(n), \ldots, h_N(n)]^t$ is an $N \times 1$ vector which contains the filter coefficients,
$\mathbf{x}(n) = [x(n), \ldots, x(n-N+1)]^t$ is the input vector containing the present and past $N-1$
samples from the input sequence and $\mu$ is a constant called the step-size which controls
the convergence and the stability of the algorithm.

An AFIR filter using (1.2) in the adaptation process converges close to the Wiener
filter after a number of iterations called the transient period [41], [78]. In communication
applications, the transmission of data is not possible during the transient period;
therefore, fast adaptation is one of the main concerns. The LMS algorithm described
in (1.2) has a very low computational complexity (number of additions, subtractions,
divisions and multiplications per iteration) and memory load, which makes it very attractive
for practical implementations. It is well known that the step-size $\mu$ influences the
performance of the adaptive filter [41]. Despite its low complexity, the LMS also has some
drawbacks which influence its performance in terms of convergence speed and accuracy.
Much research has been done during the last decades in order to develop algorithms which
eliminate or at least reduce the drawbacks of the LMS [1], [4], [5], [7], [9], [11], [12], [14]-[19],
[21], [23], [27], [28], [30], [33], [38], [39], [41], [46], [48], [58], [65], [80].

We shall review here some of the main drawbacks of the LMS algorithm. Based
on this, we classify the adaptive algorithms in terms of their goal and the problem they
address.


1. The first drawback of the LMS is its trade-off between small steady-state mean squared
error and fast convergence. For small values of $\mu$ in (1.2), the convergence of the
filter coefficients is very slow, but the steady-state mean squared error (MSE) is
small. On the contrary, for larger $\mu$ the convergence speed is increased, but so is
the steady-state MSE. It follows that, in the case of the LMS algorithm, it is not
possible to obtain fast convergence and small steady-state error at the same time. In
all practical applications, the main goal is to obtain accurate and fast convergence;
therefore new adaptive algorithms which increase the convergence speed of the
LMS while maintaining a low level of the MSE are necessary.


2. It was established in the open literature that the speed of convergence of the LMS
depends on the eigenvalue spread of the input autocorrelation matrix [4], [25], [30],
[41], [51], [56], [63]. For large eigenvalue spread, the LMS has a slow convergence,
while faster convergence is obtained for eigenvalue spread close to unity.


3. The coefficients of the adaptive filter are updated using an estimate of the cost
function gradient, $e(n)\mathbf{x}(n)$. In all applications, the signals involved (see Fig. 1.4)
might be corrupted by additive noise. When noise is present in the desired
sequence $d(n)$ or in the input sequence $x(n)$, it also interferes with the coefficient
adaptation process through the term $e(n)\mathbf{x}(n)$. Due to this fact, in applications
where the distribution of the noise is highly impulsive, the LMS might have slow
convergence and stability problems.


4. In the block diagram shown in Fig. 1.4, we have assumed that the optimum length
of the adaptive filter is known a priori. This is not always true in practice, since the
statistics of the signals involved are unknown, and so is the optimum number of
coefficients of the adaptive filter. In some applications, an estimate of the optimum
length may be of interest.


5. Tracking capability is another very important characteristic of an adaptive filter.
In stationary environments, there is a linear dependence between the step-size $\mu$
and the steady-state MSE. In this case, the optimum Wiener filter has constant
coefficients and by decreasing the step-size, the steady-state MSE is reduced. When
the environment is non-stationary, as in fading communication channels, the
dependence between the steady-state MSE and $\mu$ is not linear anymore. Actually,
this nonlinear function has a minimum which is obtained for a certain $\mu_{opt}$. The
value of $\mu_{opt}$ depends on the statistical variation of the environment. It follows
that, in non-stationary environments, in order to have small adaptation errors, the
step-size must be close to the optimum $\mu_{opt}$.
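
The trade-off described in the first item of the list can be observed in a small simulation. The sketch below is only illustrative: the unknown system `h_true`, the noise level, the two step-sizes and the helper `run_lms` are all assumptions chosen for this example and do not come from the thesis. It runs the update (1.2) twice on the same data and records the coefficient error early in the adaptation and the average squared error near the end of the record.

```python
import random


def run_lms(x, d, h_true, mu, probe):
    """Run the LMS update (1.2); return the coefficient error norm at
    iteration `probe` and the list of squared output errors."""
    n_taps = len(h_true)
    h = [0.0] * n_taps              # assumed initialisation h(0) = 0
    buf = [0.0] * n_taps            # x(n) = [x(n), ..., x(n-N+1)]
    sq_err = []
    probe_norm = None
    for n, (x_n, d_n) in enumerate(zip(x, d)):
        buf = [x_n] + buf[:-1]
        e_n = d_n - sum(a * b for a, b in zip(h, buf))
        h = [a + mu * e_n * b for a, b in zip(h, buf)]
        sq_err.append(e_n * e_n)
        if n == probe:
            probe_norm = sum((a - b) ** 2 for a, b in zip(h, h_true)) ** 0.5
    return probe_norm, sq_err


# Hypothetical setup: white Gaussian input, 4-tap system, additive Gaussian noise.
rng = random.Random(1)
h_true = [0.5, -0.3, 0.2, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(20000)]
noise = [rng.gauss(0.0, 0.1) for _ in range(20000)]
d = [sum(h_true[k] * x[n - k] for k in range(4) if n - k >= 0) + noise[n]
     for n in range(20000)]

results = {}
for mu in (0.005, 0.1):
    probe_norm, sq_err = run_lms(x, d, h_true, mu, probe=200)
    steady_mse = sum(sq_err[-5000:]) / 5000   # time average near the end
    results[mu] = (probe_norm, steady_mse)
```

With these assumed values, the larger step-size should give the smaller coefficient error after 200 iterations but the larger time-averaged squared error at the end of the record, which is the behaviour the first drawback refers to.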


There have been many adaptive algorithms introduced in the open literature which
try to overcome one or more of these drawbacks of the LMS. Depending on the problem
they address, one can make the following classification of adaptive LMS algorithms:
C1 Variable Step-Size LMS (VSSLMS) algorithms, which use in (1.2) a time-varying
step-size $\mu(n)$ instead of a fixed one. Their main goal is to speed up the convergence
while maintaining a small level of the steady-state MSE and also to reduce the trade-off
between steady-state MSE and convergence speed;

C2 Order Statistic LMS (OSLMS) algorithms, which improve the convergence of the
adaptive filter in non-Gaussian noise environments;

C3 Variable Length LMS (VLLMS) algorithms, which estimate not only the coefficients
of the adaptive filter but also its optimum length;

C4 Variable Step-Size LMS algorithms for time-varying environments. The algorithms
from this class are different from the VSSLMS in C1 due to the fact that their
primary goal is to adapt the step-size $\mu(n)$ toward the optimum $\mu_{opt}$ which minimizes
the steady-state MSE;

C5 Transform Domain LMS (TDLMS) algorithms, which use an orthogonal transformation
at the input, such that the input autocorrelation matrix is diagonalized.
This class of algorithms was introduced to improve the convergence of the LMS in
applications where the input sequence is highly correlated;

C6 Transform Domain Variable Step-size (TDVSLMS) algorithms, which are the combination
of VSSLMS and TDLMS and possess high convergence speed for both
correlated and uncorrelated input sequences. They also reduce the trade-off between
steady-state MSE and convergence speed;

C7 Transform Domain LMS algorithms for time-varying environments, which were introduced
to increase the convergence speed and to simplify the structure of the
algorithms in C4;

C8 Other decorrelation techniques, such as the scrambled LMS (SCLMS), which performs
the decorrelation of the input sequence by means of a scrambling device.
However, the SCLMS was first introduced in applications where secure data transmission
was necessary.


1.1 Overview of the thesis


The dissertation consists of three main parts. According to the above classification of the
adaptive algorithms, all classes from C1 to C8 are discussed in Chapter 2 and Chapter 3.


Chapter 2 deals with the time domain implementations of adaptive algorithms belonging
to classes C1 to C4. First, we start with a brief theoretical analysis of the standard
LMS and some algorithms with variable step-size that were proposed in the open literature.
Next, two novel Variable Step-Size LMS algorithms are introduced and described
in detail. A discussion about the effects of the misestimation of the filter length is given
in the sequel and based on this, a variable length LMS algorithm for uncorrelated input 
sequences is presented. Subsequently, a theoretical analysis of the LMS for non-stationary 
environments is described in detail and based on this analysis a novel adaptive algorithm 
with optimum step-size is introduced. Finally the class of Order Statistics LMS algorithms 
is presented and a new OSLMS algorithm is introduced. 

Chapter 3 is dedicated to the algorithms belonging to classes C5 to C8. First, the 
Transform Domain LMS algorithm is described together with some of its variants intro- 
duced in the literature. Next, three new algorithms which are combinations of VSSLMS 
and TDLMS are presented in more detail. A transform domain algorithm with optimum
step-size for time-varying environments is introduced as an alternative to the time domain 
implementation from Chapter 2. At the end of this chapter, a brief introduction to the 
class of Scrambled LMS algorithms is presented. The theoretical comparison in terms of 
mean squared error, mean squared coefficient error and convergence speed between the 
LMS, TDLMS and SCLMS is also discussed for the problem of digital data transmission 
through a telephone line. The analytical results are supported by simulations performed 
for two types of correlated input sequences. 

Chapter 4 shows the implementations of some of the algorithms from Chapter 2 and 
Chapter 3 in echo cancellation, channel equalization and CDMA frameworks. For the 
channel equalization framework the cases of Gaussian and non-Gaussian noise are addressed.


1.2 Author’s contribution 


The author's contribution to the existing theory is mainly in Chapters 2-4. To the author's
knowledge, no work has been done before towards combining the VSSLMS and
TDLMS adaptive algorithms to obtain a new class of TDVSLMS algorithms. In the open
literature, the step-size of the TDLMS is considered time-varying due to the power
normalization, and different techniques which improve the decorrelation of the input sequence
are discussed. The step-size of the algorithms from the novel class of TDVSLMS depends
also on the output error, which greatly increases their convergence speed. In addition to
that, various novel VSSLMS and VLLMS for time domain and transform domain are
introduced. The problem of optimum step-size estimation in time-varying environments
for time domain and transform domain is addressed and two new algorithms are derived.
A new Order Statistic LMS (OSLMS) algorithm suitable for applications with unknown
noise distribution is also presented.


The main contribution of this thesis is in the following points:

1. Variable step-size LMS algorithms for time domain: in Proceedings of IEEE Wireless
Communications and Networking Conference EUROCOMM 2000 [18] and presented
at X European Signal Processing Conference EUSIPCO 2000 [19].

2. Variable Length LMS algorithm for time domain, presented at IEEE International
Conference on Electronics, Circuits and Systems, ICECS 2002, [8].

3. A new Order Statistic LMS algorithm presented at IEEE International Conference
on Audio, Speech and Signal Processing, ICASSP 2002, [12].

4. Novel adaptive algorithms with optimum step-size for time-varying environments
(in both time and transform domain), presented at: 7th WSEAS International
Conference on Circuits, Systems, Communications and Computers, WSEAS/CSCC
2008, [15] and IEEE International Symposium on Image and Signal Processing and
Analysis ISISPA 2008, [14].

5. A new class of Transform Domain Variable Step-Size LMS algorithms, published
in IEEE Signal Processing Letters, [9], WSEAS Transactions on Circuits, [10] and
presented at: XI European Signal Processing Conference, EUSIPCO 2002, [11] and
IEEE International Conference on Electronics, Circuits and Systems, ICECS 2001,
[7].

6. Comparative study between Transform Domain LMS and Scrambled LMS in echo
cancellation framework, presented at International TICSP Workshop on Spectral
Methods and Multirate Signal Processing, SMMSP 2003, [6].


The author has done the basic derivations and most of the experimental and writing work
in all these publications. The publications were prepared together with the supervisor
and the co-authors of the papers. Other results related to one part or another of the
thesis were published in [13], [16] and [26].


Chapter 2 
Time domain implementations 


This chapter considers the time domain implementations of four classes of adaptive algo- 
rithms: Variable Step-Size LMS (VSSLMS), Variable Length LMS (VLLMS), the adap- 
tive LMS algorithm for time-varying environments and the Order Statistic LMS (OSLMS) 
algorithms. 

In the first section, the standard LMS algorithm [41], [78] is reviewed and its theoretical 
analysis is considered. In the analysis, we follow three main directions: first the transient 
and steady-state behavior in terms of the mean squared error is presented for the case of a 
stationary environment in which the Wiener filter has time invariant coefficients. Second, 
the problem of length mismatch between the adaptive filter and the unknown filter is 
studied in a system identification framework. The analytical expressions are derived 
for the case when the unknown system has constant coefficients for both correlated and 
uncorrelated input sequences. Finally, the analysis of the LMS algorithm for tracking 
a time-varying system with fixed length is presented and the formula for the optimum 
step-size which minimizes the steady-state MSE is obtained. 

In the second section, the class of Variable Step-Size LMS algorithms is addressed. 
First we describe a few of the most cited algorithms in the open literature and next, based
on the results outlined in the first section, two new VSSLMS algorithms are presented.
Simulation results showing the behavior of these new algorithms for the problem of
system identification are given in the sequel. The section ends with a comparison in 
terms of computational complexity, memory load and setup simplicity of the mentioned 
VSSLMS algorithms. 


The class of Variable Length LMS algorithms is detailed in Section 2.3, which starts
with a brief description of some algorithms that were published in the literature. Next
we introduce a new VLLMS algorithm for system identification, which adjusts the length
of the adaptive filter toward the length of the unknown filter. In the derivation of this
algorithm we use some theoretical results presented in the first section. The simulation
results of the proposed VLLMS algorithm for the problem of system identification for
uncorrelated input sequences are presented at the end of this section. Length adaptation
for the case of correlated input sequence is addressed in Chapter 4.

Figure 2.1: Simplified block diagram of an adaptive FIR filter.

Section 2.4 is dedicated to the problem of tracking time-varying systems. From the 
theoretical results shown in the first section we directly derive an adaptive algorithm in 
which the step-size is time-varying and converges near the optimum. The new algorithm
is implemented in the time-varying system identification framework and the simulation 
results are shown. 

The last section deals with the class of Order Statistics LMS algorithms (OSLMS). We 
first briefly review the algorithms from this class and next we introduce a new algorithm. 
The section ends with comparative simulation results of the OSLMS algorithms for the
problem of system identification for various noise distributions.


2.1 The Least Mean Square algorithm 


The simplified block diagram of a transversal adaptive FIR filter is depicted in Fig.
2.1, where the block denoted by Adaptive Filter comprises an adaptive filter
$\mathbf{h}(n) = [h_1(n), h_2(n), \ldots, h_N(n)]^t$ and the adaptive algorithm, $x(n)$ is the input sequence
from which the input vector $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^t$ is obtained, $e(n)$ is the
output error, $y(n)$ is the output of the adaptive filter and $d(n)$ is the desired signal. All the
theoretical derivations in the present section refer to this figure.

In connection with Fig. 2.1 the output of the adaptive filter can be written as follows:

$$y(n) = \mathbf{h}^t(n)\mathbf{x}(n) = \mathbf{x}^t(n)\mathbf{h}(n) = \sum_{i=1}^{N} h_i(n)\,x(n-i+1), \qquad (2.1)$$

where $t$ is the transposition operator.


The output error is expressed by the following equation [41]:

$$e(n) = d(n) - y(n). \qquad (2.2)$$

The coefficients of the adaptive filter are updated to minimize the output mean squared
error defined as follows:

$$J(n) = E[e^2(n)] = E\{[d(n) - y(n)]^2\}. \qquad (2.3)$$


The optimum filter coefficients in the mean square sense (the optimum Wiener solution)
are those coefficients for which the partial derivatives of $J(n)$ equal zero.
Denoting the vector of the optimum coefficients as $\mathbf{h}_o = [h_{o1}, \ldots, h_{oN}]^t$, the system of
equations which gives $\mathbf{h}_o$ is obtained as follows:

$$\frac{\partial J(n)}{\partial h_{oi}} = \frac{\partial E[(d(n) - y(n))^2]}{\partial h_{oi}} = -2E\{x(n-i+1)[d(n) - y_o(n)]\} = -2E\{x(n-i+1)e_o(n)\} = 0, \quad \forall i = 1, \ldots, N, \qquad (2.4)$$

where $y_o(n) = \mathbf{h}_o^t\mathbf{x}(n)$ is the output of the optimum filter and

$$e_o(n) = d(n) - \mathbf{h}_o^t\mathbf{x}(n) \qquad (2.5)$$

is the minimum output error obtained when the coefficients of the adaptive filter equal
the coefficients of the optimum Wiener filter.


Equation (2.4) can be written in a more compact form as follows:

$$E[\mathbf{x}(n)e_o(n)] = \mathbf{0}. \qquad (2.6)$$


It follows from (2.6) that the optimum error is orthogonal to the input vector at each 
time instant n, and this represents the well known principle of orthogonality. 

From (2.4), the Wiener-Hopf equations which give the coefficients of the optimum
filter are represented by:

$$\begin{aligned}
h_{o1}r(1-1) + h_{o2}r(1-2) + \cdots + h_{oN}r(1-N) &= p(0) \\
h_{o1}r(2-1) + h_{o2}r(2-2) + \cdots + h_{oN}r(2-N) &= p(1) \\
&\;\;\vdots \\
h_{o1}r(N-1) + h_{o2}r(N-2) + \cdots + h_{oN}r(N-N) &= p(N-1)
\end{aligned} \qquad (2.7)$$

where $r(i-j) = E[x(n-i)x(n-j)]$, $p(i) = E[d(n)x(n-i)]$ and $\mathbf{p} = [p(0), p(1), \ldots, p(N-1)]^t$.
In matrix form, (2.7) reads $\mathbf{R}\mathbf{h}_o = \mathbf{p}$. We note that $r(i-j) = r(j-i)$ and
$r(i-i) = r(j-j) = r(0)$, $\forall i, j$; therefore the matrix $\mathbf{R}$ can be written as:

$$\mathbf{R} = \begin{bmatrix}
r(0) & r(1) & r(2) & \cdots & r(N-1) \\
r(1) & r(0) & r(1) & \cdots & r(N-2) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
r(N-1) & r(N-2) & r(N-3) & \cdots & r(0)
\end{bmatrix}. \qquad (2.8)$$


When the matrix R is invertible and its elements can be estimated, the optimum 
Wiener filter can be easily obtained from (2.7) as: 


$$\mathbf{h}_o = \mathbf{R}^{-1}\mathbf{p}. \qquad (2.9)$$
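
As a rough numerical illustration of (2.9), the sketch below builds a small symmetric Toeplitz autocorrelation matrix of the form (2.8), forms the cross-correlation vector for an assumed FIR system and solves the resulting linear system by Gaussian elimination. The autocorrelation model $r(k) = \rho^{|k|}$, the coefficient vector `h_true` and the helper names `toeplitz` and `solve` are assumptions introduced only for this example.

```python
def toeplitz(r):
    """Symmetric Toeplitz matrix with entries R[i][j] = r(|i - j|), as in (2.8)."""
    n = len(r)
    return [[r[abs(i - j)] for j in range(n)] for i in range(n)]


def solve(a, b):
    """Solve a * h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]    # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    h = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        tail = sum(m[i][k] * h[k] for k in range(i + 1, n))
        h[i] = (m[i][n] - tail) / m[i][i]
    return h


# Assumed example: correlated input with r(k) = rho^|k| and a 3-tap system.
rho = 0.8
R = toeplitz([rho ** k for k in range(3)])
h_true = [1.0, -0.5, 0.25]
# If d(n) is the output of h_true plus noise independent of x(n), then p = R h_true.
p = [sum(R[i][j] * h_true[j] for j in range(3)) for i in range(3)]
h_o = solve(R, p)
```

In this idealised setting the Wiener solution coincides with the assumed system, so `h_o` should reproduce `h_true` up to rounding. In practice the entries of R and p would have to be estimated from data.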


In situations when the elements of the matrix $\mathbf{R}$ are not available, an iterative algorithm
can be applied to the adaptive filter which moves its coefficients toward $\mathbf{h}_o$. One
simple adaptive algorithm is the Steepest Descent (SD) algorithm, which updates the
coefficients of the adaptive filter at each iteration in the opposite direction of the cost
function gradient. In the case of the SD, the update formula for the filter coefficients is:

$$\mathbf{h}(n+1) = \mathbf{h}(n) - \frac{1}{2}\mu\nabla J(n), \qquad (2.10)$$

where $\nabla J(n) = \left[\frac{\partial J(n)}{\partial h_1(n)}, \frac{\partial J(n)}{\partial h_2(n)}, \ldots, \frac{\partial J(n)}{\partial h_N(n)}\right]^t$ and

$$\frac{\partial J(n)}{\partial h_i(n)} = -2E[e(n)x(n-i+1)], \quad i = 1, \ldots, N. \qquad (2.11)$$

In order to compute the elements of the gradient in (2.11), the expectation operator 
must be used. A simpler alternative is to use the instantaneous gradient instead of the 
true gradient and the obtained algorithm is called the Least Mean Square (LMS). As a 
consequence, the LMS algorithm uses the following coefficient update formula: 


$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu e(n)\mathbf{x}(n), \qquad (2.12)$$

where the step-size $\mu$ was introduced to control the stability of the algorithm.


Finally, the LMS algorithm can be described by the following four steps: 
Least Mean Square algorithm:

At every iteration $n$ do:

1. Form the input vector $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^t$ from the input
sequence $x(n)$;

2. Compute the output of the adaptive filter: $y(n) = \mathbf{x}^t(n)\mathbf{h}(n) = \mathbf{h}^t(n)\mathbf{x}(n)$;

3. Compute the output error: $e(n) = d(n) - y(n)$;

4. Update the coefficients of the adaptive filter: $\mathbf{h}(n+1) = \mathbf{h}(n) + \mu e(n)\mathbf{x}(n)$.
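
The four steps above can be sketched in a few lines of code. The following is a minimal illustration in a system identification setting, not a reference implementation: the function name `lms_identify`, the zero initialisation of the coefficients, the noiseless 4-tap system `h_true` and the step-size value are assumptions made for this example.

```python
import random


def lms_identify(x, d, num_taps, mu):
    """Run the four LMS steps over the whole record; return final taps and errors."""
    h = [0.0] * num_taps            # assumed initialisation h(0) = 0
    buf = [0.0] * num_taps          # holds x(n), x(n-1), ..., x(n-N+1)
    errors = []
    for x_n, d_n in zip(x, d):
        buf = [x_n] + buf[:-1]                                  # step 1: input vector
        y_n = sum(h_i * x_i for h_i, x_i in zip(h, buf))        # step 2: filter output
        e_n = d_n - y_n                                         # step 3: output error
        h = [h_i + mu * e_n * x_i for h_i, x_i in zip(h, buf)]  # step 4: update
        errors.append(e_n)
    return h, errors


# Hypothetical setup: white Gaussian input and a noiseless FIR "unknown" system.
rng = random.Random(0)
h_true = [0.5, -0.3, 0.2, 0.1]
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]
d = [sum(h_true[k] * x[n - k] for k in range(len(h_true)) if n - k >= 0)
     for n in range(len(x))]
h_est, e = lms_identify(x, d, num_taps=4, mu=0.01)
```

Since the desired signal here contains no noise and the adaptive filter has the same length as the assumed system, the estimated coefficients should end up very close to `h_true`. With additive noise the coefficients would instead fluctuate around the Wiener solution.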


2.1.1 Analysis of the LMS for stationary environments 


We point out some important results of the transient and steady-state analysis of the
mean square error of the LMS algorithm for stationary environments, which will be used
in the subsequent sections. The following fundamental assumptions are used in
order to make the convergence analysis of the LMS mathematically tractable¹ [41]:

a. The input vectors $\mathbf{x}(1), \mathbf{x}(2), \ldots, \mathbf{x}(n)$ are statistically independent of each other;

b. The input vector $\mathbf{x}(n)$ at time instant $n$ is independent of all previous samples of
the desired sequence, $d(1), d(2), \ldots, d(n-1)$;

c. The desired response $d(n)$ at time instant $n$ depends on the corresponding input
vector $\mathbf{x}(n)$, but it is statistically independent of all previous samples of the desired
response;

d. The input vector $\mathbf{x}(n)$ and the desired response $d(n)$ consist of mutually
Gaussian-distributed random variables at all time instants $n$.


We start with the analysis of the coefficient error vector defined as:

$$\Delta\mathbf{h}(n) = \mathbf{h}(n) - \mathbf{h}_o, \qquad (2.13)$$

where $\mathbf{h}_o$ is the vector of the optimum coefficients given in (2.9).
Subtracting the optimum coefficients vector from (2.12) and using (2.13), one obtains:

$$\Delta\mathbf{h}(n+1) = \mathbf{h}(n+1) - \mathbf{h}_o = \mathbf{h}(n) - \mathbf{h}_o + \mu e(n)\mathbf{x}(n) = \Delta\mathbf{h}(n) + \mu e(n)\mathbf{x}(n). \qquad (2.14)$$


¹ These four assumptions form the so-called independence theory which, even though it is violated in
many practical applications, has been shown to retain sufficient information about the adaptive process
to allow the derivation of quite reliable design guidelines.
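Subsequently, the formula to compute the output error can be rewritten as follows:

$$e(n) = d(n) - \mathbf{h}^t(n)\mathbf{x}(n) = d(n) - \mathbf{h}_o^t\mathbf{x}(n) + \mathbf{h}_o^t\mathbf{x}(n) - \mathbf{h}^t(n)\mathbf{x}(n) = e_o(n) - \Delta\mathbf{h}^t(n)\mathbf{x}(n), \qquad (2.15)$$

where $e_o(n)$ is the minimum error defined in (2.5).
From (2.14) and (2.15) the coefficient error vector can be obtained as follows:

$$\begin{aligned}
\Delta\mathbf{h}(n+1) &= \Delta\mathbf{h}(n) - \mu\mathbf{x}(n)\mathbf{x}^t(n)\Delta\mathbf{h}(n) + \mu\mathbf{x}(n)e_o(n) \\
&= [\mathbf{I} - \mu\mathbf{x}(n)\mathbf{x}^t(n)]\Delta\mathbf{h}(n) + \mu\mathbf{x}(n)e_o(n), \qquad (2.16)
\end{aligned}$$

where $\mathbf{I}$ is the $N \times N$ identity matrix.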
Taking the mathematical expectation on both sides of (2.16) we obtain: 


$$E[\Delta\mathbf{h}(n+1)] = E\{[\mathbf{I} - \mu\mathbf{x}(n)\mathbf{x}^t(n)]\Delta\mathbf{h}(n)\} + \mu E[\mathbf{x}(n)e_o(n)]. \qquad (2.17)$$


The last term in (2.17) vanishes due to the principle of orthogonality expressed by
(2.6). From (2.12) and the independence assumptions it follows that the coefficient error
vector $\Delta\mathbf{h}(n)$ is independent of $\mathbf{x}(n)$ and, as a consequence, (2.17) simplifies as follows:

$$E[\Delta\mathbf{h}(n+1)] = [\mathbf{I} - \mu\mathbf{R}]E[\Delta\mathbf{h}(n)]. \qquad (2.18)$$


The stability condition for the step-size $\mu$, which ensures the convergence of the adaptive
filter coefficients in the mean, can be obtained from (2.18) after some mathematical
manipulations as follows (see [41] for more details):

$$0 < \mu < \frac{2}{\lambda_{max}}, \qquad (2.19)$$

where $\lambda_{max}$ is the maximum eigenvalue of $\mathbf{R}$.
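
The role of the largest eigenvalue in (2.19) can be checked numerically by iterating the mean recursion (2.18) for a step-size slightly below and slightly above $2/\lambda_{max}$. The sketch below is only an illustration under assumed values: the 3 × 3 Toeplitz matrix, the power iteration used to estimate $\lambda_{max}$ and the helper names are not taken from the thesis.

```python
def mat_vec(a, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]


def lambda_max(R, num_iter=200):
    """Largest eigenvalue of a symmetric positive definite matrix (power iteration)."""
    v = [1.0 + 0.1 * i for i in range(len(R))]
    lam = 0.0
    for _ in range(num_iter):
        w = mat_vec(R, v)
        norm = sum(w_i * w_i for w_i in w) ** 0.5
        v = [w_i / norm for w_i in w]
        lam = sum(v_i * rv_i for v_i, rv_i in zip(v, mat_vec(R, v)))  # Rayleigh quotient
    return lam


def mean_error_norm(R, mu, num_iter):
    """Norm of the mean coefficient error after num_iter steps of (2.18)."""
    dh = [1.0] * len(R)                      # arbitrary initial mean error
    for _ in range(num_iter):
        r_dh = mat_vec(R, dh)
        dh = [a - mu * b for a, b in zip(dh, r_dh)]
    return sum(a * a for a in dh) ** 0.5


# Assumed autocorrelation sequence r(0), r(1), r(2).
r = [1.0, 0.5, 0.25]
R = [[r[abs(i - j)] for j in range(3)] for i in range(3)]
mu_limit = 2.0 / lambda_max(R)
stable = mean_error_norm(R, 0.9 * mu_limit, 200)
unstable = mean_error_norm(R, 1.1 * mu_limit, 200)
```

For the smaller step-size the mean error decays toward zero, while for the larger one it grows without bound. Note that this only concerns convergence in the mean; convergence of the mean squared error requires a tighter bound on the step-size.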

To obtain an analytical expression for the output mean squared error, we first compute 
the correlation matrix of the coefficient error vector C(n). The matrix C(n) is defined as 
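$\mathbf{C}(n) = E[\Delta\mathbf{h}(n)\Delta\mathbf{h}^t(n)]$ and it is obtained from (2.18) as follows (see [41] for detailed
analytical derivations):

$$\mathbf{C}(n+1) = \mathbf{C}(n) - \mu[\mathbf{R}\mathbf{C}(n) + \mathbf{C}(n)\mathbf{R}] + \mu^2\mathbf{R}\,\mathrm{tr}[\mathbf{R}\mathbf{C}(n)] + 2\mu^2\mathbf{R}\mathbf{C}(n)\mathbf{R} + \mu^2 J_{min}\mathbf{R}, \qquad (2.20)$$

where $J_{min} = E[e_o^2(n)]$ is the minimum MSE obtained in the case of perfect adaptation.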


Recalling equation (2.3), the MSE can be further detailed as follows:

$$\begin{aligned}
J(n) &= E[(d(n) - \mathbf{h}^t(n)\mathbf{x}(n))^2] \\
&= E[(e_o(n) + \mathbf{h}_o^t\mathbf{x}(n) - \mathbf{h}^t(n)\mathbf{x}(n))^2] \\
&= E[(e_o(n) - \Delta\mathbf{h}^t(n)\mathbf{x}(n))(e_o(n) - \mathbf{x}^t(n)\Delta\mathbf{h}(n))], \qquad (2.21)
\end{aligned}$$

where $\Delta\mathbf{h}(n) = \mathbf{h}(n) - \mathbf{h}_o$ and $\mathbf{h}_o$ is the optimum Wiener solution.
Finally, taking into account the above assumptions and after some mathematical manipulations,
the MSE can be expressed as (see [41] for more details):

$$J(n) = J_{min} + \mathrm{tr}[\mathbf{R}\mathbf{C}(n)], \qquad (2.22)$$

where $\mathbf{R}$ is the input autocorrelation matrix, $\mathbf{C}(n)$ is the correlation matrix of the
coefficient error vector and $J_{min}$ is the minimum MSE, which is obtained in the case of
perfect adaptation (if $\mathbf{h}(n) = \mathbf{h}_o$).

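The steady-state value of the mean squared error can be expressed by combining
(2.20) and (2.22) for $n \to \infty$, and the following analytical expression results [41]:

$$J_{st} = J_{min} + J_{min}\frac{\sum_{i=1}^{N}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}{1 - \sum_{i=1}^{N}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}, \tag{2.23}$$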


which for small values of the step-size $\mu$ becomes:

$$J_{st} \approx J_{min} + \frac{\mu}{2}J_{min}\sum_{i=1}^{N}\lambda_i = J_{min} + \frac{\mu}{2}J_{min}N\sigma_x^{2} = J_{min} + J_{ex}, \tag{2.24}$$


where $\lambda_i$ is the $i$-th eigenvalue of $\mathbf{R}$ and $\sigma_x^{2} = r(0)$ is the variance of the input sequence.

It follows from (2.23) and (2.24) that the steady-state MSE contains two components.
The first component, $J_{min}$, is the minimum MSE which can be achieved in the ideal case of
perfect adaptation, when the coefficients of the adaptive filter equal the coefficients of
the Wiener filter. The second component, $J_{ex}$, called the excess MSE, depends upon the
step-size $\mu$ and is due to the misadaptation of the filter coefficients. We shall retain
these two equations, which will be recalled later in this thesis.
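
The two components of the steady-state MSE can be illustrated by iterating the recursion (2.20) until it settles and then reading the excess MSE from (2.22). The sketch below is not taken from the thesis: the matrix `R`, the values of `mu` and `J_min`, and the function name are hypothetical. The quantity `J_ex_closed` comes from our own algebra (solving (2.20) at steady state in the eigenbasis of $\mathbf{R}$) and should be read as an assumption to be checked against [41], not as a quotation.

```python
import numpy as np

def steady_state_C(R, mu, J_min, n_iter=5000):
    """Iterate the recursion (2.20) from C(0) = I until it has settled."""
    C = np.eye(R.shape[0])
    for _ in range(n_iter):
        C = (C - mu * (R @ C + C @ R)
             + mu**2 * np.trace(R @ C) * R
             + 2.0 * mu**2 * (R @ C @ R)
             + mu**2 * J_min * R)
    return C

# Hypothetical values, for illustration only
R = np.array([[1.00, 0.50, 0.25],
              [0.50, 1.00, 0.50],
              [0.25, 0.50, 1.00]])
mu, J_min = 0.01, 1e-2

C_inf = steady_state_C(R, mu, J_min)
J_ex_iter = np.trace(R @ C_inf)                  # excess MSE, tr[R C] in (2.22)

lam = np.linalg.eigvalsh(R)
S = np.sum(mu * lam / (2.0 - 2.0 * mu * lam))    # own algebra, see lead-in
J_ex_closed = J_min * S / (1.0 - S)
J_ex_small = 0.5 * mu * J_min * np.trace(R)      # small step-size form (2.24)

print(J_ex_iter, J_ex_closed, J_ex_small)
```

For a small step-size the three numbers are close to each other, and all of them scale with $J_{min}$: the excess MSE is a fraction of the minimum MSE, not an independent noise floor.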

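The misadjustment is defined as the ratio between the excess MSE and the minimum
MSE and it is given by [41], [78]:

$$M = \frac{J_{ex}}{J_{min}} = \frac{\sum_{i=1}^{N}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}{1 - \sum_{i=1}^{N}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}, \tag{2.25}$$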
and for small step-size the above expression can be simplified to:

$$M = \frac{\mu}{2}N\sigma_x^{2}. \tag{2.26}$$

The average learning curve of the LMS algorithm can be approximated with an exponential
with time constant $\tau_{av}$. In this case the misadjustment can be expressed as follows:

$$M = \frac{\mu N\lambda_{av}}{2} = \frac{N}{4\tau_{av}}, \tag{2.27}$$

where $\lambda_{av} = \frac{1}{N}\sum_{i=1}^{N}\lambda_i$ is the average eigenvalue of $\mathbf{R}$, and

$$\tau_{av} = \frac{1}{2\mu\lambda_{av}} \tag{2.28}$$

is the average time constant, which is proportional to the length of the transient period of
the algorithm.
We should note that the condition for convergence in the mean square sense (the
condition which ensures $J(n) \to J_{st}$) is given by [41]:

$$0 < \mu < \frac{2}{3\sum_{i=1}^{N}\lambda_i} = \frac{2}{3N\sigma_x^{2}}. \tag{2.29}$$


It follows from (2.27) and (2.28) that, for a given filter length $N$, the misadjustment
is directly proportional to the step-size and the convergence time is inversely proportional
to $\mu$. As a consequence, if two adaptive filters with the same length $N$ and different
step-sizes are compared, faster convergence is obtained with the filter that has the larger
step-size. However, the filter with the smaller step-size will converge to a smaller level of
misadjustment $M$. This is a very important conclusion, which is the basis for the derivation
of the class of Variable Step-Size LMS algorithms.
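
As a small numerical illustration of this trade-off, the sketch below evaluates the small step-size expressions for the misadjustment, $M = \mu N\lambda_{av}/2$, and for the average time constant, $\tau_{av} = 1/(2\mu\lambda_{av})$, for two step-sizes. The function name and the numbers ($N = 5$, $\lambda_{av} = 1$) are assumptions chosen for illustration only.

```python
def lms_tradeoff(mu, N, lam_av):
    """Small step-size approximations of misadjustment and time constant."""
    tau_av = 1.0 / (2.0 * mu * lam_av)   # average time constant, as in (2.28)
    M = mu * N * lam_av / 2.0            # misadjustment, as in (2.27)
    return M, tau_av

# Hypothetical setting: N = 5 taps, unit average eigenvalue
M_fast, tau_fast = lms_tradeoff(mu=1e-2, N=5, lam_av=1.0)
M_slow, tau_slow = lms_tradeoff(mu=1e-3, N=5, lam_av=1.0)

print(M_fast, tau_fast)   # larger step-size: shorter transient, larger M
print(M_slow, tau_slow)   # smaller step-size: longer transient, smaller M
```

Reducing the step-size by a factor of ten reduces the misadjustment by the same factor and lengthens the transient by the same factor, so the product $M\tau_{av} = N/4$ stays fixed.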


2.1.2 Mean squared error as a function of the adaptive filter length


In the previous section, we pointed out some important analytical expressions for the
optimum Wiener solution, the MSE and the coefficient error vector of an adaptive FIR filter that
uses the LMS algorithm to update its coefficients. From the analytical expression (2.24)
of the steady-state MSE we see that it depends upon $J_{min}$, which is the minimum MSE
in the case of perfect adaptation. In some applications, such as system identification, the
desired signal $d(n)$ in Fig. 2.1 might be generated by an FIR filter of a certain length (for
instance, in echo cancellation the sequence $d(n)$ represents the echo generated by the echo
path, which is modelled as an FIR filter). Usually, in this kind of application the length
of the filter generating the desired sequence is not known and the length of the adaptive
filter is chosen by the user. As a consequence, when there is a mismatch between the
lengths of the two filters, there is a bias term which increases the term $J_{min}$ in
(2.24) and also the steady-state value of the MSE. In this section, we extend the previous
analysis with an emphasis on the effect of the length adaptation. We focus on the problem
of system identification as shown in Fig. 2.2, in which the desired sequence $d(n)$ is obtained
at the output of an FIR filter (called the unknown system) of length $N$ while the adaptive
filter has a different length $N_{ad}$.

Figure 2.2: System identification block diagram for the case when the unknown filter and
the adaptive filter have different lengths.


To this end, we first change some notations in order to emphasize the difference between
the lengths of the adaptive filter and the unknown system. We denote by $N$, $\mathbf{h}_N$,
$N_{ad}$ and $\hat{\mathbf{h}}_{N_{ad}}$ the length of the unknown system, the coefficients vector of the unknown
system, the length of the adaptive filter and the coefficients vector of the adaptive filter,
respectively.

With reference to Fig. 2.2, the adaptive filter coefficients are updated by the following
formula:

$$\hat{\mathbf{h}}_{N_{ad}}(n+1) = \hat{\mathbf{h}}_{N_{ad}}(n) + \mu\mathbf{x}_{N_{ad}}(n)e(n), \tag{2.30}$$

where $\mu$ is the step-size and $\mathbf{x}_{N_{ad}}(n)$ is the vector of the past $N_{ad}$ input samples.


The output error is computed as:

$$e(n) = y(n) - \hat{y}(n) + v(n), \tag{2.31}$$

where $v(n)$ is the output noise,

$$y(n) = \mathbf{h}_N^{t}\mathbf{x}_N(n) = \sum_{i=1}^{N} h_i x(n-i+1), \tag{2.32}$$

$$\hat{y}(n) = \hat{\mathbf{h}}_{N_{ad}}^{t}(n)\mathbf{x}_{N_{ad}}(n) = \sum_{i=1}^{N_{ad}} \hat{h}_i(n) x(n-i+1), \tag{2.33}$$

where $N$ and $N_{ad}$ respectively indicate the lengths of the corresponding vectors.
The coefficients vectors $\mathbf{h}_N$ and $\hat{\mathbf{h}}_{N_{ad}}(n)$ are denoted as follows:

$$\mathbf{h}_N = [h_1, h_2, \ldots, h_N]^{t}, \quad \hat{\mathbf{h}}_{N_{ad}}(n) = \left[\hat{h}_1(n), \hat{h}_2(n), \ldots, \hat{h}_{N_{ad}}(n)\right]^{t}, \tag{2.34}$$

and the input vectors are written as:

$$\mathbf{x}_N(n) = [x(n), \ldots, x(n-N+1)]^{t}, \quad \mathbf{x}_{N_{ad}}(n) = [x(n), \ldots, x(n-N_{ad}+1)]^{t}. \tag{2.35}$$
We should emphasize that the lengths $N$ and $N_{ad}$ are usually not equal, since the length
$N$ of the unknown filter is not known and the length $N_{ad}$ of the adaptive filter is chosen
by the user. Of course, in applications where the length estimation is not of primary
concern, a long adaptive filter might be implemented without introducing a bias term in
the steady-state MSE, as will become clear below.

The optimum coefficients $\mathbf{h}_{o,N_{ad}}$, which minimize the mean squared error, are obtained
from the Wiener-Hopf equation. We recall here equation (2.7), in which we explicitly write
the terms $p_i(n)$, and we assume that the output noise $v(n)$ is a random Gaussian-distributed,
zero-mean sequence independent of $x(n)$:

$$\begin{aligned}
h_{o,1}r(0) + h_{o,2}r(1) + \cdots + h_{o,N_{ad}}r(N_{ad}-1) &= h_1 r(0) + h_2 r(1) + \cdots + h_N r(N-1) \\
h_{o,1}r(1) + h_{o,2}r(0) + \cdots + h_{o,N_{ad}}r(N_{ad}-2) &= h_1 r(1) + h_2 r(0) + \cdots + h_N r(N-2) \\
&\;\;\vdots \\
h_{o,1}r(N_{ad}-1) + h_{o,2}r(N_{ad}-2) + \cdots + h_{o,N_{ad}}r(0) &= h_1 r(N_{ad}-1) + \cdots + h_N r(N-N_{ad})
\end{aligned} \tag{2.36}$$
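Subsequently, (2.36) can be written in a compact form:

$$\mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{h}_{o,N_{ad}} = \mathbf{R}_{N_{ad}\times N}\mathbf{h}_N, \tag{2.37}$$

where

$$\mathbf{R}_{N_{ad}\times N_{ad}} = E\left[\mathbf{x}_{N_{ad}}(n)\mathbf{x}_{N_{ad}}^{t}(n)\right] =
\begin{bmatrix}
r(0) & r(1) & \cdots & r(N_{ad}-1) \\
r(1) & r(0) & \cdots & r(N_{ad}-2) \\
\vdots & \vdots & \ddots & \vdots \\
r(N_{ad}-1) & r(N_{ad}-2) & \cdots & r(0)
\end{bmatrix}, \tag{2.38}$$

$$\mathbf{R}_{N_{ad}\times N} = E\left[\mathbf{x}_{N_{ad}}(n)\mathbf{x}_{N}^{t}(n)\right] =
\begin{bmatrix}
r(0) & r(1) & \cdots & r(N-1) \\
r(1) & r(0) & \cdots & r(N-2) \\
\vdots & \vdots & & \vdots \\
r(N_{ad}-1) & r(N_{ad}-2) & \cdots & r(N-N_{ad})
\end{bmatrix}, \tag{2.39}$$

and the subscripts indicate the sizes of the corresponding vectors and matrices.

We note that the matrix $\mathbf{R}_{N_{ad}\times N_{ad}}$ in (2.37) is a square matrix, whereas $\mathbf{R}_{N_{ad}\times N}$
has different numbers of rows and columns. As a consequence, in order to compute the
optimum coefficients, three cases must be taken into account, namely $N < N_{ad}$, $N = N_{ad}$
and $N > N_{ad}$, as follows: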


Case 1: $N < N_{ad}$. In order to obtain the optimum coefficients, the vector of the
unknown system is padded with zeros and we obtain the following vector of length
$N_{ad}$:

$$\mathbf{h}_{N_{ad}} = [h_1, \ldots, h_N, 0, \ldots, 0]^{t}. \tag{2.40}$$

Subsequently, (2.37) can be written as follows:

$$\mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{h}_{o,N_{ad}} = \mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{h}_{N_{ad}}, \tag{2.41}$$

and the optimum solution is given by the following equation:

$$\mathbf{h}_{o,N_{ad}} = \mathbf{h}_{N_{ad}}, \tag{2.42}$$

with $\mathbf{h}_{N_{ad}}$ given in (2.40).

Case 2: $N = N_{ad}$. The optimum solution for this case is the same as in the previous
case, with the difference that the vector of the unknown filter is not padded with
zeros since $\mathbf{h}_{o,N_{ad}}$ and $\mathbf{h}_N$ are of the same length.


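Case 3: $N > N_{ad}$. To compute the optimum coefficients vector, we first rewrite
(2.37) in an equivalent form:

$$\mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{h}_{o,N_{ad}} = \mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{h}_{N_{ad}} + \mathbf{R}_{N_{ad}\times(N-N_{ad})}\mathbf{h}_{N-N_{ad}}, \tag{2.43}$$

where the vectors and the matrices from the right-hand side of (2.43) are given by:

$$\mathbf{h}_{N_{ad}} = [h_1, \ldots, h_{N_{ad}}]^{t}, \quad \mathbf{h}_{N-N_{ad}} = [h_{N_{ad}+1}, \ldots, h_N]^{t}, \tag{2.44}$$

$$\mathbf{R}_{N_{ad}\times(N-N_{ad})} =
\begin{bmatrix}
r(N_{ad}) & r(N_{ad}+1) & \cdots & r(N-1) \\
r(N_{ad}-1) & r(N_{ad}) & \cdots & r(N-2) \\
\vdots & \vdots & & \vdots \\
r(1) & r(2) & \cdots & r(N-N_{ad})
\end{bmatrix}, \tag{2.45}$$

and $\mathbf{R}_{N_{ad}\times N_{ad}}$ is given in (2.38).

Multiplying (2.43) by $\mathbf{R}_{N_{ad}\times N_{ad}}^{-1}$ from the left, the vector of the optimum coefficients is
obtained:

$$\mathbf{h}_{o,N_{ad}} = \mathbf{h}_{N_{ad}} + \mathbf{R}_{N_{ad}\times N_{ad}}^{-1}\mathbf{R}_{N_{ad}\times(N-N_{ad})}\mathbf{h}_{N-N_{ad}}. \tag{2.46}$$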


Finally, combining (2.42) and (2.46), the optimum Wiener solution can be expressed
in a compact form as follows:

$$\mathbf{h}_{o,N_{ad}} =
\begin{cases}
[h_1, \ldots, h_N, 0, \ldots, 0]^{t} & \text{if } N < N_{ad}, \\
[h_1, \ldots, h_N]^{t} & \text{if } N = N_{ad}, \\
\mathbf{h}_{N_{ad}} + \mathbf{R}_{N_{ad}\times N_{ad}}^{-1}\mathbf{R}_{N_{ad}\times(N-N_{ad})}\mathbf{h}_{N-N_{ad}} & \text{if } N > N_{ad}.
\end{cases} \tag{2.47}$$

As a result, when the length $N_{ad}$ of the adaptive filter exceeds the length $N$ of the
unknown filter, the vector of the optimum coefficients is obtained by padding the vector
$\mathbf{h}_N$ with $N_{ad} - N$ zeros. On the contrary, when the length of the adaptive filter is smaller
than the length of $\mathbf{h}_N$, the optimum coefficients are obtained by adding a bias term to the
corresponding coefficients of the unknown filter, as shown in (2.47).

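In the case of uncorrelated inputs, the elements of the matrix $\mathbf{R}_{N_{ad}\times(N-N_{ad})}$ are all
zero and in (2.46) we have $\mathbf{h}_{o,N_{ad}} = \mathbf{h}_{N_{ad}}$, with $\mathbf{h}_{N_{ad}}$ given in (2.44). Consequently, for
uncorrelated input sequences the vector of the optimum coefficients is given by:

$$\mathbf{h}_{o,N_{ad}} =
\begin{cases}
[h_1, \ldots, h_N, 0, \ldots, 0]^{t} & \text{if } N < N_{ad}, \\
[h_1, \ldots, h_N]^{t} & \text{if } N_{ad} = N, \\
[h_1, \ldots, h_{N_{ad}}]^{t} & \text{if } N > N_{ad}.
\end{cases} \tag{2.48}$$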
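
The three cases can be reproduced numerically by solving the compact equation (2.37) directly. The following is a minimal sketch under stated assumptions: the unknown system `h_N`, the autocorrelation sequences `r_corr` and `r_white`, and both function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def autocorr_matrix(r, rows, cols):
    """Matrix whose (i, j) entry is r(|i - j|), as in (2.38) and (2.39)."""
    i = np.arange(rows)[:, None]
    j = np.arange(cols)[None, :]
    return np.asarray(r)[np.abs(i - j)]

def optimum_coefficients(h, r, N_ad):
    """Solve R_{Nad x Nad} h_o = R_{Nad x N} h_N, i.e. equation (2.37)."""
    h = np.asarray(h, dtype=float)
    R_sq = autocorr_matrix(r, N_ad, N_ad)
    R_rect = autocorr_matrix(r, N_ad, len(h))
    return np.linalg.solve(R_sq, R_rect @ h)

h_N = np.array([1.0, -0.5, 0.25, 0.1])   # hypothetical unknown system, N = 4
r_corr = 0.5 ** np.arange(6)             # hypothetical correlated input, r(k) = 0.5**k
r_white = np.eye(6)[0]                   # uncorrelated input, r(k) = 0 for k > 0

h_o_long = optimum_coefficients(h_N, r_corr, 6)    # N < N_ad: zero padding
h_o_short = optimum_coefficients(h_N, r_corr, 2)   # N > N_ad, correlated: biased
h_o_white = optimum_coefficients(h_N, r_white, 2)  # N > N_ad, uncorrelated: truncated

print(h_o_long, h_o_short, h_o_white)
```

With the correlated sequence the short solution differs from the first two taps of `h_N`, whereas with the uncorrelated sequence it coincides with them, in agreement with the last cases of (2.47) and (2.48).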


To complete the steady-state analysis of the LMS algorithm for the case of length
mismatch, we shall also study the output mean squared error. To this end, we recall
equation (2.22), which gives the output MSE, where we explicitly denote the sizes of the
matrices and vectors involved:

$$J(n) = J_{min} + \mathrm{tr}\left[\mathbf{R}_{N_{ad}\times N_{ad}}\mathbf{C}_{N_{ad}\times N_{ad}}(n)\right], \tag{2.49}$$

where $\mathbf{R}_{N_{ad}\times N_{ad}}$ is given in (2.38),

$$\mathbf{C}_{N_{ad}\times N_{ad}}(n) = E\left\{\Delta\mathbf{h}_{N_{ad}}(n)\Delta\mathbf{h}_{N_{ad}}^{t}(n)\right\}, \tag{2.50}$$

$$\Delta\mathbf{h}_{N_{ad}}(n) = \hat{\mathbf{h}}_{N_{ad}}(n) - \mathbf{h}_{o,N_{ad}}, \tag{2.51}$$

and $\mathbf{h}_{o,N_{ad}}$ is given by (2.47) for correlated inputs and (2.48) for uncorrelated inputs.


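The steady-state MSE is obtained following similar derivations as in the previous
section and we obtain:

$$J_{st} = E\left\{e^{2}(\infty)\right\} = J_{min}\left[1 + \frac{\sum_{i=1}^{N_{ad}}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}{1 - \sum_{i=1}^{N_{ad}}\dfrac{\mu\lambda_i}{2 - 2\mu\lambda_i}}\right], \tag{2.52}$$

where $\lambda_i$ are the eigenvalues of the input signal autocorrelation matrix $\mathbf{R}_{N_{ad}\times N_{ad}}$.

For small values of the step-size $\mu$, (2.52) simplifies to:

$$J_{st} \approx J_{min}\left(1 + \frac{\mu}{2}\sum_{i=1}^{N_{ad}}\lambda_i\right) = J_{min}\left(1 + \frac{\mu}{2}\mathrm{tr}\left[\mathbf{R}_{N_{ad}\times N_{ad}}\right]\right) = J_{min} + \frac{\mu}{2}J_{min}N_{ad}\sigma_x^{2}. \tag{2.53}$$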


From (2.53) we see that the misadjustment $M = \mu N_{ad}\sigma_x^{2}/2$ does not depend on the
difference between the lengths of the two filters (the unknown system and the adaptive filter).
Intuitively, we can explain this by the following fact: if the length of the adaptive filter
is larger than the length of the unknown system, the extra coefficients converge to
zero. However, they will not be exactly equal to zero but will oscillate around zero. These
small oscillations of the extra coefficients generate misadjustment exactly as the other
non-zero coefficients do. When the length of the adaptive filter is smaller than the length
of $\mathbf{h}_N$, the optimum coefficients are obtained by adding a bias term to the corresponding
coefficients from $\mathbf{h}_N$. The bias terms are non-zero if the input is correlated and they are
zero for uncorrelated inputs. However, the misadjustment measures the adaptation to
the optimum solution, which is biased, and not to the coefficients of the unknown system.

To obtain the final expression of the steady-state MSE, the value of $J_{min}$ must be
expressed for three different cases: $N_{ad} < N$, $N_{ad} = N$ and $N_{ad} > N$.


Analysis of the minimum mean squared error 


The value of the minimum mean squared error, $J_{min}$, in (2.52) and (2.53) is obtained
in the case of perfect adaptation (when the coefficients of the adaptive filter equal the
optimum coefficients) and it is given by:

$$\begin{aligned}
J_{min} &= E\left\{e_o^{2}(n)\right\} = E\left\{\left[y(n) + v(n) - y_o(n)\right]^{2}\right\} = E\left\{v^{2}(n)\right\} + E\left\{\left[y(n) - y_o(n)\right]^{2}\right\}, \\
J_{min} &= \sigma_v^{2} + E\left\{\left[\sum_{i=1}^{N} h_i x(n-i+1) - \sum_{i=1}^{N_{ad}} h_{o,i} x(n-i+1)\right]^{2}\right\},
\end{aligned} \tag{2.54}$$

where $y_o(n) = \sum_{i=1}^{N_{ad}} h_{o,i} x(n-i+1)$ is the optimum output and $v(n)$ is a zero-mean random
Gaussian-distributed sequence with variance $\sigma_v^{2}$, independent of $x(n)$.
It follows from (2.47) that $J_{min}$, for the case of a correlated input sequence, can be
written as:

$$J_{min} =
\begin{cases}
\sigma_v^{2} & \text{if } N \le N_{ad}, \\
\sigma_v^{2} + E\left\{\left[\sum_{i=1}^{N_{ad}} (h_i - h_{o,i}) x(n-i+1) + \sum_{i=N_{ad}+1}^{N} h_i x(n-i+1)\right]^{2}\right\} & \text{if } N > N_{ad}.
\end{cases} \tag{2.55}$$
For an uncorrelated input sequence $x(n)$, $E\{x(n-i+1)x(n-j+1)\} = 0$ for $i \neq j$, and
the vector of the optimum coefficients is given by (2.48). As a result, equation (2.55), in
the case of uncorrelated input, simplifies to:

$$J_{min} =
\begin{cases}
\sigma_v^{2} & \text{if } N \le N_{ad}, \\
\sigma_v^{2} + \sum_{i=N_{ad}+1}^{N} h_i^{2} r(0) & \text{if } N > N_{ad}.
\end{cases} \tag{2.56}$$
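
For uncorrelated input, the minimum MSE is therefore the noise variance plus the energy of the taps that the adaptive filter cannot model, scaled by $r(0)$. A minimal sketch of this computation follows; the function name, the unknown system `h_N` and the noise variance are hypothetical values used only for illustration.

```python
import numpy as np

def j_min_uncorrelated(h, N_ad, sigma_v2, r0=1.0):
    """Minimum MSE for uncorrelated input: noise floor plus unmodelled tap energy."""
    tail = np.asarray(h, dtype=float)[N_ad:]    # empty when N <= N_ad
    return sigma_v2 + r0 * np.sum(tail ** 2)

h_N = np.array([1.0, -0.5, 0.25, 0.1])          # hypothetical unknown system, N = 4
J_short = j_min_uncorrelated(h_N, N_ad=2, sigma_v2=0.01)   # N > N_ad: biased
J_long = j_min_uncorrelated(h_N, N_ad=6, sigma_v2=0.01)    # N < N_ad: noise floor only

print(J_short, J_long)
```

Because every omitted tap contributes a non-negative term $h_i^{2} r(0)$, the bias in this uncorrelated case cannot decrease when the adaptive filter is shortened further.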


Simulations and results 


In order to test the above analytical results, two sets of simulations were carried out. In the
first set, the behavior of the MSE and the behavior of the coefficients in the mean are studied
for a correlated input sequence. With this analysis we check the
validity of (2.47) and (2.55). The second set of simulations was done for an uncorrelated input
sequence, with a view to verifying the validity of (2.48) and (2.56). Both sets of simulations
were done in the system identification framework depicted in Fig. 2.2, for the case when
the length of the adaptive filter was smaller than the length of the unknown system
($N_{ad} < N$) and also for the situation when the adaptive filter has more coefficients than
the unknown system ($N_{ad} > N$).

In all simulations, a number of 100 independent Monte-Carlo simulations were done, 
each of them consisting of a number of 5 x 10* iterations and the results were averaged. 
The coefficients of the adaptive filter were initialized with ones although in practice, when
there is no information about the optimum solution, they are usually initialized with zeros.
We have chosen this kind of initialization to show that the extra coefficients of the adaptive
filter (when $N_{ad} > N$) are adapted toward zero independently of the initialization.
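
The setup described above can be sketched in a few lines. The code below is a simplified illustration, not the simulation used in the thesis: it performs a single run (no Monte-Carlo averaging), uses white Gaussian input, and the unknown system `h_true`, the step-size, the noise level and the function name `lms_identify` are all hypothetical. It shows the extra coefficients of a longer adaptive filter moving from their initial value of one toward zero.

```python
import numpy as np

def lms_identify(h_true, N_ad, mu, n_iter, sigma_v, rng):
    """Single LMS run identifying h_true with an adaptive filter of length N_ad."""
    N = len(h_true)
    x_buf = np.zeros(max(N, N_ad))       # most recent input sample first
    h_hat = np.ones(N_ad)                # initialised with ones, as in the text
    for _ in range(n_iter):
        x_buf = np.roll(x_buf, 1)
        x_buf[0] = rng.standard_normal()                 # white Gaussian input
        d = h_true @ x_buf[:N] + sigma_v * rng.standard_normal()
        e = d - h_hat @ x_buf[:N_ad]                     # output error
        h_hat = h_hat + mu * e * x_buf[:N_ad]            # LMS coefficient update
    return h_hat

rng = np.random.default_rng(0)
h_true = np.array([0.8, -0.4, 0.2, 0.1])                 # hypothetical system, N = 4
h_hat = lms_identify(h_true, N_ad=6, mu=0.01, n_iter=5000, sigma_v=0.01, rng=rng)

print(np.round(h_hat, 3))
```

The first four coefficients settle close to `h_true` and the last two close to zero, fluctuating slightly around these values because of the misadjustment.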

Correlated input sequence: 


1. $N > N_{ad}$: In this case, the length of the unknown FIR filter $\mathbf{h}_N$ in Fig. 2.2 was
$N = 9$, the length of the adaptive filter was $N_{ad} = 5$ and the input sequence $x(n)$ was
generated by the following model:

$$x(n) = 1.79x(n-1) - 1.85x(n-2) + 1.27x(n-3) - 0.41x(n-4) + \eta(n), \tag{2.57}$$

where $\eta(n)$ is a random zero-mean Gaussian-distributed sequence with variance
chosen such that the variance of $x(n)$ is unity ($\sigma_x^{2} = 1$).

The trace of the input autocorrelation matrix $\mathbf{R}_{N_{ad}\times N_{ad}}$ was $\mathrm{tr}\left[\mathbf{R}_{N_{ad}\times N_{ad}}\right] = N_{ad}\sigma_x^{2} = 5$.
It follows that the condition for convergence in the mean square sense in
(2.29) is $\mu < 0.13$. A step-size $\mu = 10^{-?}$, which satisfies this condition, was chosen.

Figure 2.3: The coefficients $\hat{h}_1(n)$ to $\hat{h}_5(n)$ of the adaptive filter during the adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line)
for correlated input sequence and $N > N_{ad}$.


The output MSE is shown in Fig. 2.4 together with the steady-state MSE obtained
when the adaptive filter and the unknown filter are of equal length ($N_{ad} = N = 9$).
We can see from this figure that the obtained steady-state level of the MSE is
higher than the one obtained for equal lengths, which verifies the analytical result
of (2.55). It follows that for a correlated input sequence and $N > N_{ad}$, the steady-state
MSE has a bias term that is due to the extra coefficients of the unknown system,
which have no counterparts in the adaptive filter. Unfortunately, there
is no proof that larger differences in the lengths of the filters increase the bias term.
In fact, it is possible in some cases that the extra coefficients of the unknown system
have negative values and therefore the MSE bias can decrease when the difference
between the lengths of the two filters increases.

Figure 2.4: Mean squared error for correlated input sequence and $N > N_{ad}$.


The coefficients of the adaptive filter during the adaptation are plotted in Fig. 2.3,
where the corresponding coefficients of the unknown system are also shown. The
plotted learning curves clearly show that, at steady state, there is a bias in every
coefficient of the adaptive filter, which verifies the analytical result of (2.47).

We can conclude that there is a bias term in every adaptive filter coefficient in
the case of correlated input sequences when the adaptive filter is shorter than the
unknown system. These bias terms prevent the coefficients of the adaptive filter from
converging to the corresponding coefficients of the unknown system. In other
words, the coefficients of the optimum filter are not equal to the coefficients of
the unknown system. The misadjustment of the adaptive filter (which is a measure
of the adaptation to the optimum filter) is not influenced by the bias. On the other
hand, the minimum mean squared error $J_{min}$ is affected by the bias terms of the
coefficients and by the non-zero elements of the matrix $\mathbf{R}_{N_{ad}\times(N-N_{ad})}$.


2. $N < N_{ad}$: In the simulations performed for this case we have used the same unknown
system of length $N = 9$ and the same correlated input sequence given by the model
in (2.57). The length of the adaptive filter was $N_{ad} = 11$ and the step-size used to
update its coefficients was $\mu = 10^{-?}$, which satisfies the stability condition in the
mean square sense.

Figure 2.5: The coefficients $\hat{h}_1(n)$ to $\hat{h}_6(n)$ of the adaptive filter during adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line) for
correlated input sequence and $N < N_{ad}$.

Figure 2.6: The coefficients $\hat{h}_7(n)$ to $\hat{h}_{11}(n)$ of the adaptive filter during the adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line)
for correlated input sequence and $N < N_{ad}$.

The behavior of the adaptive filter coefficients is shown in Fig. 2.5 and Fig.
2.6. In both figures, the corresponding coefficients of the unknown system are also
plotted as dashed lines. Since the adaptive filter is longer than the unknown
system, in the last two plots of Fig. 2.6 we have shown the zero level with a dashed
line. From Fig. 2.5 and Fig. 2.6 we can see that the first 9 coefficients converge to
$\mathbf{h}_N$ whereas the last two coefficients of the adaptive filter converge to zero, which
confirms the analytical result of (2.47).

Figure 2.7: Mean squared error for correlated input sequence and $N < N_{ad}$.

The MSE during the adaptation is shown in Fig. 2.7 together with the steady-state
MSE obtained when the length of the adaptive filter equals the length of the
unknown system. From this figure, we can see that the convergence of the MSE for
$N < N_{ad}$ and for $N = N_{ad}$ is the same. This is due to the fact that the extra
coefficients of the adaptive filter converge to zero, as shown in Fig. 2.6.
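
For the correlated-input experiments above, the input sequence and the mean-square step-size bound can be sketched as follows. This is an illustrative sketch only: the driving noise is drawn with unit variance and the output is rescaled by its sample standard deviation to obtain unit variance, which is our own shortcut (the thesis states only that the noise variance was chosen so that $\sigma_x^{2} = 1$). The function name, the burn-in length and the seed are hypothetical.

```python
import numpy as np

def ar4_input(n_samples, rng, burn_in=1000):
    """Generate the AR(4) input of (2.57), rescaled to unit sample variance."""
    a = np.array([1.79, -1.85, 1.27, -0.41])    # coefficients of x(n-1) ... x(n-4)
    total = n_samples + burn_in
    eta = rng.standard_normal(total)            # zero-mean Gaussian driving noise
    x = np.zeros(total)
    for n in range(4, total):
        x[n] = a @ x[n - 4:n][::-1] + eta[n]    # [x(n-1), x(n-2), x(n-3), x(n-4)]
    x = x[burn_in:]                             # discard the start-up transient
    return x / x.std()                          # assumption: empirical rescaling

N_ad, sigma_x2 = 5, 1.0
mu_max = 2.0 / (3.0 * N_ad * sigma_x2)          # bound quoted in the text as 0.13

x = ar4_input(5000, np.random.default_rng(0))
print(mu_max, x.var())
```

The bound depends only on the trace of the autocorrelation matrix, so it is the same for the correlated and the uncorrelated experiments with $N_{ad} = 5$ and unit input variance.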


Uncorrelated input sequence: 


1. $N > N_{ad}$: The behavior of an adaptive filter using the LMS algorithm is studied in
the system identification setup depicted in Fig. 2.2, for the case when the length
$N_{ad}$ of the adaptive filter is smaller than the length $N$ of the unknown system and
the input $x(n)$ is uncorrelated. To this end, we have chosen $N = 9$, $N_{ad} = 5$ and
a zero-mean random Gaussian-distributed input sequence with unit variance. In
this case, the trace of the input autocorrelation matrix is $\mathrm{tr}\left[\mathbf{R}_{N_{ad}\times N_{ad}}\right] = 5$ and a
step-size $\mu = 10^{-?}$, which fulfills the condition for the MSE convergence, was used.

Figure 2.8: Coefficients $\hat{h}_1(n)$ to $\hat{h}_5(n)$ of the adaptive filter during adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line) for
uncorrelated input sequence and $N > N_{ad}$.

Figure 2.9: Mean squared error for uncorrelated input sequence and $N > N_{ad}$.

The values of the coefficients during the adaptation are shown in Fig. 2.8 together
with the values of the first five coefficients of the unknown system (dashed line). We
see that the coefficients of the adaptive filter converge close to their corresponding
coefficients of the unknown system.

The mean squared error $J(n)$ during adaptation is shown in Fig. 2.9 together with
the value of the steady-state MSE, denoted as $J_{st}$, obtained in the case of equal
lengths. Clearly, there is a positive bias term due to the extra coefficients of the
unknown filter (see (2.56)).


2. $N < N_{ad}$: In order to study the MSE and the coefficient behavior for the case when
the adaptive filter has more coefficients than the unknown system, the same simulations
were repeated for $N = 9$ and $N_{ad} = 11$. The coefficients of the unknown
system were the same as in the previous experiments, the input sequence $x(n)$ was
a zero-mean random Gaussian-distributed sequence with variance $\sigma_x^{2} = 1$ and a
step-size $\mu = 10^{-?}$ was chosen.

The coefficients of the adaptive filter during the adaptation, together with the coefficients
of the unknown system, are shown in Fig. 2.10 and Fig. 2.11. Because the
adaptive filter has more coefficients than $\mathbf{h}_N$, in the last two plots of Fig. 2.11 the
zero levels are shown with dashed lines. We can see from these plots that the first
9 coefficients of the adaptive filter converge close to the corresponding coefficients
of the unknown system, whereas the extra coefficients converge to zero.

The MSE at the output of the adaptive filter is plotted in Fig. 2.12 together with
the value of the steady-state MSE obtained for equal lengths (dashed line). We can
see that, although the adaptive filter has more coefficients, the steady-state value
of the MSE is unbiased. This result is a consequence of the fact that the extra
coefficients of the adaptive filter converge to zero and the others converge to $\mathbf{h}_N$.


In conclusion, the analysis of the adaptive FIR filter using the LMS algorithm for the
problem of system identification was presented. In the analysis, we have been interested
in studying the effect of the mismatch between the lengths of the unknown system and the
adaptive filter. The aim of this study is to provide a theoretical basis for the development
of the class of Variable Length LMS algorithms, in which not only the estimation of the
coefficients is of interest but also the length estimation.

From the presented analysis, we can conclude the following: when the input sequence
is correlated and the adaptive filter is shorter than the unknown system, the coefficients
of the adaptive filter converge to biased values of the first $N_{ad}$ coefficients of $\mathbf{h}_N$. In
this case, the steady-state MSE is also biased as compared with the case of equal lengths.

Figure 2.10: Coefficients $\hat{h}_1(n)$ to $\hat{h}_6(n)$ of the adaptive filter during adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line)
for uncorrelated input sequence and $N < N_{ad}$.

Figure 2.11: Coefficients $\hat{h}_7(n)$ to $\hat{h}_{11}(n)$ of the adaptive filter during adaptation
(continuous line) and the corresponding coefficients of the unknown system (dashed line)
for uncorrelated input sequence and $N < N_{ad}$.


When the length of $\hat{\mathbf{h}}_{N_{ad}}$ is larger than the length of the unknown system, the first $N$
coefficients of the adaptive filter converge to $\mathbf{h}_N$ and the others to zero. In this case the
steady-state MSE is the same as in the case of $N = N_{ad}$. Unfortunately, for a correlated
input sequence there is no direct dependence between the value of the MSE bias and the
difference between the filter lengths. In some cases, the extra coefficients of the unknown
system can have negative values; therefore, there is no guarantee that a larger mismatch
of the lengths increases the MSE bias.

Figure 2.12: Mean squared error for uncorrelated input sequence and $N < N_{ad}$ (vertical axis:
mean square error in dB; horizontal axis: iterations).

For an uncorrelated input sequence, the optimum coefficients are equal to the first $N_{ad}$
coefficients of the unknown system in the case $N_{ad} < N$, whereas for $N_{ad} > N$ the first
$N$ coefficients of the optimum solution are equal to the coefficients of $\mathbf{h}_N$ and the others
are zero. As a consequence, the steady-state output MSE is biased when $N_{ad} < N$ and
unbiased for $N_{ad} > N$. Moreover, there is a direct dependence between the bias
level and the length mismatch of the filters. When the length of the adaptive filter is closer
to the length of the unknown system the level of the bias is smaller, whereas when the
difference between these two lengths is increased the bias increases.


2.1.3 Optimum step-size for time-varying environments 


In this section, we study the behavior of the LMS adaptive algorithm in a time-varying
environment. With reference to Section 2.1.1, we derive here the equivalent of the Wiener-Hopf
equation (2.7) for a time-varying environment. To this end, we start from the minimization
of the MSE defined in (2.3). The optimum coefficients are obtained by setting the partial
derivatives of $J(n)$ equal to zero and we obtain:

$$\begin{aligned}
h_{o,1}(n)r_{1-1}(n) + h_{o,2}(n)r_{1-2}(n) + \cdots + h_{o,N}(n)r_{1-N}(n) &= p_1(n) \\
h_{o,1}(n)r_{2-1}(n) + h_{o,2}(n)r_{2-2}(n) + \cdots + h_{o,N}(n)r_{2-N}(n) &= p_2(n) \\
&\;\;\vdots \\
h_{o,1}(n)r_{N-1}(n) + h_{o,2}(n)r_{N-2}(n) + \cdots + h_{o,N}(n)r_{0}(n) &= p_N(n)
\end{aligned} \tag{2.58}$$

where $r_{i-j}(n) = E[x(n-i+1)x(n-j+1)]$, $p_i(n) = E[d(n)x(n-i+1)]$ and $\mathbf{p}(n) =
[p_1(n), p_2(n), \ldots, p_N(n)]^{t}$.

In (2.58), the input sequence $x(n)$ and the desired sequence $d(n)$ were assumed to be
non-stationary, as opposed to (2.3), where $x(n)$ and $d(n)$ were assumed stationary. As
a consequence, the expectation of $x(n-i+1)x(n-j+1)$ depends on the time
instant $n$, such that $E\{x(n-i+1)x(n-j+1)\} \neq E\{x(m-i+1)x(m-j+1)\}$ for
$n \neq m$. This is why the time instant $n$ appears explicitly in (2.58). The indexes of the
elements $r_{i-j}(n)$ represent the difference between the time instants of $x(n-i+1)$ and
$x(n-j+1)$ and they express the fact that the cross-correlations between different inputs
of the adaptive filter are computed. For the same reason, the elements of the vector $\mathbf{p}(n)$
contain information about the time instant at which they were computed (since $x(n)$ and
$d(n)$ are assumed to be non-stationary). Since the autocorrelation matrix $\mathbf{R}(n)$ and the
vector $\mathbf{p}(n)$ are time-varying, the optimum vector $\mathbf{h}_o(n)$ is also time-varying and (2.58)
can be written in a compact form as follows:

$$\mathbf{R}(n)\mathbf{h}_o(n) = \mathbf{p}(n), \tag{2.59}$$

where:

$$\mathbf{R}(n) =
\begin{bmatrix}
r_0(n) & r_1(n) & \cdots & r_{N-1}(n) \\
r_1(n) & r_0(n) & \cdots & r_{N-2}(n) \\
\vdots & \vdots & \ddots & \vdots \\
r_{N-1}(n) & r_{N-2}(n) & \cdots & r_0(n)
\end{bmatrix}. \tag{2.60}$$


We note that (2.59) includes all the non-stationary situations that can appear in
practice, as follows: if $x(n)$ is non-stationary but $d(n)$ is stationary, the autocorrelation
matrix and the cross-correlation vector are both time-varying. When $x(n)$ and $d(n)$ are
both non-stationary, $\mathbf{R}(n)$ and $\mathbf{p}(n)$ are again time-varying. In applications where the
input sequence $x(n)$ is stationary and the desired sequence $d(n)$ is non-stationary, just
the cross-correlation vector $\mathbf{p}(n)$ is time-varying and the matrix $\mathbf{R}$ has fixed coefficients.
As a result, in all the above-mentioned cases the optimum vector $\mathbf{h}_o(n)$ has time-varying
coefficients.

Here we address the problem of time-varying system identification depicted in Fig.
2.13, where $\mathbf{h}(n) = [h_1(n), \ldots, h_N(n)]^{t}$ is a linear time-varying channel whose coefficients
must be estimated by an adaptive FIR filter with the same number of coefficients, $\hat{\mathbf{h}}(n) =
[\hat{h}_1(n), \hat{h}_2(n), \ldots, \hat{h}_N(n)]^{t}$; $x(n)$ is the stationary input sequence, and $d(n)$, $e(n)$ and $v(n)$ are
the desired sequence, the output error and the output noise, respectively.

Figure 2.13: Block diagram of time-varying system identification using an adaptive FIR
filter.

We note that the difference between Fig. 2.2 and Fig. 2.13 is that the coefficients of
the unknown system are not fixed but time-varying. However,
the length of the unknown system is assumed here to be known and an adaptive FIR filter
with the same length is implemented. Because the input sequence $x(n)$ is
stationary and the desired sequence $d(n)$ is obtained at the output of a time-varying FIR
filter, the input autocorrelation matrix is time-invariant and the cross-correlation vector
$\mathbf{p}(n)$ is time-varying. Therefore, we restrict our discussion to the applications where the
non-stationarity appears in the desired sequence $d(n)$, and (2.59) simplifies to:

$$\mathbf{R}\mathbf{h}_o(n) = \mathbf{p}(n). \tag{2.61}$$

The other situations, when the non-stationarity appears in the input sequence or in both
$x(n)$ and $d(n)$, are beyond the scope of this thesis. Although we address here just
this case, there are many practical applications in which the analytical results developed
in this section and the algorithm introduced subsequently are of great interest. One
example of such an application is echo cancellation, where the echo path can be modelled by
a time-varying FIR filter.

With reference to Fig. 2.13, the noisy observation at time $n$ is given by:

$$d(n) = y(n) + v(n) = \mathbf{h}^{t}(n)\mathbf{x}(n) + v(n), \tag{2.62}$$

where $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^{t}$ is the tap input vector.

To obtain the optimum coefficients vector $\mathbf{h}_o(n)$ we express the cross-correlation vector
$\mathbf{p}(n)$ in terms of the autocorrelation matrix and the vector $\mathbf{h}(n)$. Assuming that the output noise
$v(n)$ is zero mean and independent of $x(n)$, we have $\mathbf{p}(n) = E[d(n)\mathbf{x}(n)] = \mathbf{R}\mathbf{h}(n)$, and (2.61)
can be written:

$$\mathbf{R}\mathbf{h}_o(n) = \mathbf{R}\mathbf{h}(n) \;\Longrightarrow\; \mathbf{h}_o(n) = \mathbf{h}(n). \tag{2.63}$$


As a result, the optimum coefficients vector at each time instant $n$ equals the vector of the
time-varying coefficients of the unknown system. The minimum output error is obtained
in the ideal case when the vector of the adaptive filter coefficients equals $\mathbf{h}_o(n)$ and it can
be expressed as follows:

$$e_o(n) = \mathbf{h}^{t}(n)\mathbf{x}(n) - \mathbf{h}_o^{t}(n)\mathbf{x}(n) + v(n) = v(n). \tag{2.64}$$

From (2.64), the minimum MSE is obtained as follows:

$$J_{min} = E\left[e_o^{2}(n)\right] = E\left[v^{2}(n)\right] = \sigma_v^{2}. \tag{2.65}$$


The coefficients of the adaptive filter $\hat{\mathbf{h}}(n)$ are modified to minimize the output mean
squared error. When the LMS algorithm is used for adaptation, the update formula for
the coefficients of the adaptive filter is:

$$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu e(n)\mathbf{x}(n), \tag{2.66}$$

where $\mu$ is the step-size.
In order to make the theoretical analysis more tractable, the following assumptions
are commonly used in the open literature (see [28] and [41]):

1. $\mathbf{x}(n)$ and $e_o(n)$ are zero-mean, stationary, jointly normal and with finite moments,
where $e_o(n)$ is the output error in the case of perfect adaptation (when $\hat{\mathbf{h}}(n) =
\mathbf{h}_o(n)$).

2. The successive increments $\boldsymbol{\epsilon}(n) = \mathbf{h}(n+1) - \mathbf{h}(n)$ of the channel coefficients are
independent of each other. However, the elements of $\boldsymbol{\epsilon}(n)$ might be statistically
dependent for a given $n$. The sequence $\boldsymbol{\epsilon}(n)$ is zero-mean and stationary, such that
the covariance matrix of the filter coefficient increments $\mathbf{Q} = E\left[\boldsymbol{\epsilon}(n)\boldsymbol{\epsilon}^{t}(n)\right]$ is
time-invariant.

3. $\hat{\mathbf{h}}(n)$ is independent of $e_o(n)$ and $\mathbf{x}(n)$. This assumption is satisfied when the value
of the step-size $\mu$ is small enough.

4. $e_o(n)$, $\boldsymbol{\epsilon}(n)$ and $\mathbf{x}(n)$ are independent of each other.
According to Assumption 2, we restrict our discussion to the tracking of time-varying
systems given by the following model:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \boldsymbol{\epsilon}(n), \tag{2.67}$$

with $\boldsymbol{\epsilon}(n)$ being the $N \times 1$ vector of the channel increments.

However, the time-varying model in (2.67) is not the only model which satisfies
Assumption 2. It was shown in [25] that the Markov model $\mathbf{h}(n+1) = a\mathbf{h}(n) + \boldsymbol{\epsilon}(n)$
also satisfies Assumption 2, provided that the constant $a$ is close to unity.
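
The interplay between the random-walk model (2.67) and the LMS update (2.66) can be illustrated with a short simulation. The following is a minimal sketch only: the function name `lms_track_random_walk`, the white Gaussian input, and all parameter values are assumptions made for illustration and are not taken from the thesis.

```python
import random


def lms_track_random_walk(mu, n_iter=5000, N=4, sigma_eps=1e-3, sigma_v=0.1, seed=0):
    """Track a random-walk FIR system, as in (2.67), with fixed-step LMS, as in (2.66).

    Returns the squared output error averaged over the second half of the run,
    used here as a rough estimate of the steady-state MSE.
    """
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(N)]   # unknown time-varying system h(n)
    h_hat = [0.0] * N                              # adaptive filter coefficients
    x_vec = [0.0] * N                              # tap input vector x(n)
    sq_err = []
    for _ in range(n_iter):
        # Shift a new white Gaussian input sample (unit variance) into the tap vector.
        x_vec = [rng.gauss(0.0, 1.0)] + x_vec[:-1]
        # Noisy observation d(n) = h^T(n) x(n) + v(n).
        d = sum(a * b for a, b in zip(h, x_vec)) + rng.gauss(0.0, sigma_v)
        # Output error e(n) = d(n) - h_hat^T(n) x(n).
        e = d - sum(a * b for a, b in zip(h_hat, x_vec))
        # LMS coefficient update.
        h_hat = [a + mu * e * b for a, b in zip(h_hat, x_vec)]
        # Random-walk increment of the unknown system.
        h = [a + rng.gauss(0.0, sigma_eps) for a in h]
        sq_err.append(e * e)
    tail = sq_err[n_iter // 2:]
    return sum(tail) / len(tail)
```

Calling `lms_track_random_walk(0.01)` with these illustrative settings returns an estimate of the steady-state MSE that can be compared against the analytical expressions derived later in this section.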

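Subtracting $\mathbf{h}_o(n+1) = \mathbf{h}(n+1)$ from both sides of (2.66) and using (2.67), the coefficient
error vector is obtained as follows:

$$\Delta\mathbf{h}(n+1) = \hat{\mathbf{h}}(n+1) - \mathbf{h}(n+1) = \hat{\mathbf{h}}(n) - \mathbf{h}(n) - \boldsymbol{\epsilon}(n) + \mu e(n)\mathbf{x}(n),$$
$$\Delta\mathbf{h}(n+1) = \Delta\mathbf{h}(n) - \boldsymbol{\epsilon}(n) + \mu e(n)\mathbf{x}(n). \tag{2.68}$$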


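The output error is obtained as:

$$e(n) = y(n) - \hat{y}(n) + v(n) = \mathbf{h}^T(n)\mathbf{x}(n) - \hat{\mathbf{h}}^T(n)\mathbf{x}(n) + v(n),$$
$$e(n) = \mathbf{h}^T(n)\mathbf{x}(n) - \mathbf{h}_o^T(n)\mathbf{x}(n) + \mathbf{h}_o^T(n)\mathbf{x}(n) - \hat{\mathbf{h}}^T(n)\mathbf{x}(n) + v(n),$$
$$e(n) = v(n) - \Delta\mathbf{h}^T(n)\mathbf{x}(n), \tag{2.69}$$

where $\hat{y}(n) = \hat{\mathbf{h}}^T(n)\mathbf{x}(n)$.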
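From (2.68) and (2.69) it follows that the coefficient error vector can be written in
the following manner:

$$\Delta\mathbf{h}(n+1) = \Delta\mathbf{h}(n) - \boldsymbol{\epsilon}(n) - \mu\mathbf{x}(n)\mathbf{x}^T(n)\Delta\mathbf{h}(n) + \mu v(n)\mathbf{x}(n). \tag{2.70}$$

The mean of the coefficient error vector is obtained by taking the expectation of
(2.70) and using Assumptions 1 and 2:

$$E\left[\Delta\mathbf{h}(n+1)\right] = \left(\mathbf{I} - \mu\mathbf{R}\right)E\left[\Delta\mathbf{h}(n)\right]. \tag{2.71}$$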


Following a derivation similar to that in [25], [28] and [41], it can be shown that for large $n$ the
left-hand side of (2.71) converges to zero provided that the step-size satisfies the following
condition:

$$0 < \mu < \frac{2}{\lambda_{max}}, \tag{2.72}$$

where $\lambda_{max}$ is the maximum eigenvalue of $\mathbf{R}$.


At this point we should make two important remarks:

• The expected value of the coefficient error vector $E\left[\Delta\mathbf{h}(n)\right]$ converges to zero for large $n$
in view of Assumption 2, in which the increments $\boldsymbol{\epsilon}(n)$ of the unknown
filter coefficients are assumed zero mean. If this assumption is violated, a non-zero
term $E\left[\boldsymbol{\epsilon}(n)\right]$ appears in the right-hand side of (2.71), which prevents the vector
$E\left[\Delta\mathbf{h}(n+1)\right]$ from converging to zero.

• The convergence condition of (2.72) is valid for a time-invariant autocorrelation
matrix $\mathbf{R}$. When the autocorrelation matrix is time-varying (due to the
non-stationarity of $x(n)$), the stability condition of (2.72) becomes:

$$0 < \mu < \frac{2}{\lambda_{max}(n)}, \tag{2.73}$$

which means that the step-size $\mu$ should be smaller than twice the inverse of the maximum
eigenvalue of $\mathbf{R}(n)$ computed at each time instant $n$.


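The correlation matrix $\mathbf{C}(n) = E\left[\Delta\mathbf{h}(n)\Delta\mathbf{h}^T(n)\right]$ of the coefficient error vector can
be computed by multiplying (2.70) with its transpose and taking the expectation,
as follows:

$$\mathbf{C}(n+1) = E\left[\Delta\mathbf{h}(n+1)\Delta\mathbf{h}^T(n+1)\right] = \mathbf{C}(n) + \mathbf{Q} - \mu\mathbf{R}\mathbf{C}(n) - \mu\mathbf{C}(n)\mathbf{R} + \mu^2\sigma_v^2\mathbf{R} + \mu^2\mathbf{R}\,\mathrm{tr}\left[\mathbf{R}\mathbf{C}(n)\right] + 2\mu^2\mathbf{R}\mathbf{C}(n)\mathbf{R}. \tag{2.74}$$

For small values of the step-size $\mu$ the last two terms in (2.74) can be neglected (see [28])
and the matrix $\mathbf{C}(n+1)$ becomes:

$$\mathbf{C}(n+1) = \mathbf{C}(n) + \mathbf{Q} - \mu\mathbf{R}\mathbf{C}(n) - \mu\mathbf{C}(n)\mathbf{R} + \mu^2\sigma_v^2\mathbf{R}. \tag{2.75}$$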

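At the steady-state, for large values of $n$, we have $\lim \mathbf{C}(n+1) = \lim \mathbf{C}(n) = \mathbf{C}(\infty)$ and (2.75)
can be written as follows:

$$\mu\mathbf{R}\mathbf{C}(\infty) + \mu\mathbf{C}(\infty)\mathbf{R} = \mathbf{Q} + \mu^2\sigma_v^2\mathbf{R}. \tag{2.76}$$

The steady-state mean square coefficient error, defined as $\Theta_{st} = \mathrm{tr}\left[\mathbf{C}(\infty)\right]$, is obtained by
pre-multiplying (2.76) with $\mathbf{R}^{-1}$ and taking the trace, as follows:

$$2\mu\Theta_{st} = \mathrm{tr}\left[\mathbf{R}^{-1}\mathbf{Q}\right] + \mu^2\sigma_v^2\,\mathrm{tr}\left[\mathbf{I}\right],$$
$$\Theta_{st} = \frac{1}{2}\left[\mu N\sigma_v^2 + \frac{1}{\mu}\mathrm{tr}\left[\mathbf{R}^{-1}\mathbf{Q}\right]\right], \tag{2.77}$$

where we have used the fact that $\mathrm{tr}\left[\mathbf{A}\mathbf{B}\right] = \mathrm{tr}\left[\mathbf{B}\mathbf{A}\right]$.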
Using Assumptions 1 to 4 and following a development similar to [28], [41], which is
not detailed here, the steady-state MSE is obtained as follows:

$$J_{st} = J_{min} + \mathrm{tr}\left[\mathbf{R}\mathbf{C}(\infty)\right], \tag{2.78}$$

where $J_{min}$ is the minimum MSE in the case of perfect adaptation and it is expressed by
(2.65).

Finally, after some mathematical manipulations, the steady-state MSE can be explicitly
expressed by the following formula:

$$J_{st} = \sigma_v^2 + \frac{1}{2}\mu\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right] + \frac{1}{2\mu}\mathrm{tr}\left[\mathbf{Q}\right]. \tag{2.79}$$
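
The manipulations leading from (2.78) to (2.79) can be sketched as follows. This is a reconstruction of the omitted steps, assuming only (2.65), (2.76), (2.78) and the identity tr[AB] = tr[BA]; it is not a quotation of the original derivation.

```latex
\begin{align*}
\mathrm{tr}\left[\mu\mathbf{R}\mathbf{C}(\infty) + \mu\mathbf{C}(\infty)\mathbf{R}\right]
  &= \mathrm{tr}\left[\mathbf{Q}\right] + \mu^2\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right]
  && \text{(trace of (2.76))} \\
2\mu\,\mathrm{tr}\left[\mathbf{R}\mathbf{C}(\infty)\right]
  &= \mathrm{tr}\left[\mathbf{Q}\right] + \mu^2\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right]
  && \text{(since } \mathrm{tr}[\mathbf{C}\mathbf{R}] = \mathrm{tr}[\mathbf{R}\mathbf{C}]\text{)} \\
\mathrm{tr}\left[\mathbf{R}\mathbf{C}(\infty)\right]
  &= \frac{1}{2}\mu\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right] + \frac{1}{2\mu}\mathrm{tr}\left[\mathbf{Q}\right] \\
J_{st} = J_{min} + \mathrm{tr}\left[\mathbf{R}\mathbf{C}(\infty)\right]
  &= \sigma_v^2 + \frac{1}{2}\mu\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right] + \frac{1}{2\mu}\mathrm{tr}\left[\mathbf{Q}\right]
  && \text{(using } J_{min} = \sigma_v^2\text{)}
\end{align*}
```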

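The steady-state MSE given by (2.79) is a nonlinear function of the step-size $\mu$ and it
possesses a minimum (see [28] and [41]) which corresponds to an optimum step-size $\mu_{opt}^{mse}$.
The value of $\mu_{opt}^{mse}$ can be computed by setting the derivative of (2.79) with respect to $\mu$ equal
to zero:

$$\mu_{opt}^{mse} = \sqrt{\frac{\mathrm{tr}\left[\mathbf{Q}\right]}{\sigma_v^2\,\mathrm{tr}\left[\mathbf{R}\right]}}. \tag{2.80}$$

The steady-state value of the mean square coefficient error $\Theta_{st}$ expressed by (2.77)
is also a nonlinear function of the step-size $\mu$ and its minimum is obtained for:

$$\mu_{opt}^{msc} = \sqrt{\frac{\mathrm{tr}\left[\mathbf{R}^{-1}\mathbf{Q}\right]}{N\sigma_v^2}}. \tag{2.81}$$
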
In conclusion, for time-varying system identification the steady-state MSE and the
steady-state mean square coefficient (MSC) error are nonlinear functions of the step-size
$\mu$. This differs from the case of time-invariant system identification, where the
steady-state MSE and the steady-state MSC error are linear functions of the step-size $\mu$.
As a consequence, for applications in which minimization of the output error is of
primary interest, a step-size close to $\mu_{opt}^{mse}$ should be used. On the contrary, when the
minimization of the coefficient error represents the main goal, the step-size used in the
adaptation process must be close to $\mu_{opt}^{msc}$. Another interesting conclusion is that the
class of Variable Step-Size LMS algorithms, which were introduced based on the
analytical results obtained for a time-invariant environment, might not be suitable for
time-varying systems. The VSSLMS algorithms which were derived based on the linear
dependence between the steady-state MSE and the step-size might not give good results in
time-varying environments, in the sense that if the step-size is decreased as the algorithm
approaches the steady-state, the MSE may actually increase. However, the optimum
step-size $\mu_{opt}^{mse}$ can be accommodated in the VSSLMS such that the speed of convergence
is increased while maintaining a small output misadjustment.

In order to use (2.80) for the computation of the optimum step-size, one needs to know
$\mathrm{tr}\left[\mathbf{Q}\right]$, $\mathrm{tr}\left[\mathbf{R}\right]$ and $\sigma_v^2$. Although the trace of $\mathbf{R}$ can be estimated during the adaptation,
the noise variance and the trace of $\mathbf{Q}$ are not known in practice. The common method in
the present literature is to estimate the channel parameters and the channel noise prior to
adaptation and to compute the optimum step-size using (2.80). A problem arises when the
channel statistics cannot be easily and accurately estimated or when they change during
the adaptation. In such cases, the optimum step-size cannot be estimated in
advance. Iterative methods are therefore necessary which adaptively change the
step-size toward the optimum, such that the steady-state MSE or the steady-state MSC error is
minimized. An iterative algorithm for step-size adaptation toward $\mu_{opt}^{mse}$ is introduced in
Section 2.3 of this thesis.
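
To make the nonlinear dependence on the step-size concrete, the steady-state expressions (2.77) and (2.79) can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and the closed-form minimizers are those obtained by setting the derivatives of (2.79) and (2.77) with respect to the step-size to zero, that is sqrt(tr[Q] / (sigma_v^2 tr[R])) and sqrt(tr[R^-1 Q] / (N sigma_v^2)).

```python
import math


def steady_state_mse(mu, sigma_v2, tr_R, tr_Q):
    """Steady-state MSE of (2.79) for a random-walk system."""
    return sigma_v2 + 0.5 * mu * sigma_v2 * tr_R + tr_Q / (2.0 * mu)


def steady_state_msc(mu, sigma_v2, N, tr_RinvQ):
    """Steady-state mean square coefficient error of (2.77)."""
    return 0.5 * (mu * N * sigma_v2 + tr_RinvQ / mu)


def mu_opt_mse(sigma_v2, tr_R, tr_Q):
    """Step-size that zeroes the derivative of (2.79) with respect to mu."""
    return math.sqrt(tr_Q / (sigma_v2 * tr_R))


def mu_opt_msc(sigma_v2, N, tr_RinvQ):
    """Step-size that zeroes the derivative of (2.77) with respect to mu."""
    return math.sqrt(tr_RinvQ / (N * sigma_v2))
```

For example, with the assumed values sigma_v^2 = 0.01, tr[R] = 4 and tr[Q] = 4e-6, `mu_opt_mse` gives a step-size of 0.01, and `steady_state_mse` is larger both at half and at twice that value, which is the behaviour that makes a monotonically decreasing step-size questionable in a time-varying environment.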


2.1.4 Simulations and results 


At this point, we perform some computer simulations with the aim of verifying the
analytical results of (2.77) and (2.79). To this end, we have implemented an adaptive
FIR filter in the time-varying system identification framework depicted in Fig. 2.13. The
length of the unknown system $\mathbf{h}(n)$ was $N = 4$ and the adaptive filter $\hat{\mathbf{h}}(n)$ had the same
length. The coefficients of the unknown system were modelled as in (2.67) and the elements
of e(n) were zero mean random Gaussian-distributed with identical variances o? = 107°, 
such that the covariance matrix of the increments equals $\mathbf{Q} = \sigma_\epsilon^2\mathbf{I}$, with $\mathbf{I}$ being the identity matrix.
The input sequence $x(n)$ used in the simulations was given by the following model:

$$x(n) = a\,x(n-1) + \eta(n),$$

where $a = 0.75$ and $\eta(n)$ was zero mean Gaussian distributed with variance chosen such
that the variance of $x(n)$ was unity.
The optimum step-sizes $\mu_{opt}^{mse}$ and $\mu_{opt}^{msc}$, which minimize the steady-state MSE and $\Theta_{st}$ respectively,
are obtained from (2.77) and (2.79) as follows:

OPS vid’ 00071 


opt 


We have conducted several experiments using the above system setup. Within each
experiment the step-size used to update the adaptive filter coefficients was constant,
but different experiments used different step-sizes. For each experiment
a number of 100 independent runs of 10 iterations were performed and the results were
averaged. In every experiment the steady-state MSE and the steady-state mean square
coefficient error were computed; they are plotted in Fig. 2.14 and Fig. 2.15 respectively,
together with their theoretical values obtained by evaluating (2.79) and (2.77) for
different step-sizes.

Figure 2.14: Steady-state excess MSE vs. the step-size for a time-varying system identification: experimental results (continuous line) and theoretical results (dashed line).

Figure 2.15: Steady-state mean square coefficient error vs. the step-size for a time-varying system identification: experimental results (continuous line) and theoretical results (dashed line).

A good agreement between theory and practice can be observed in both Fig. 2.14
and Fig. 2.15.


2.2 Variable step-size Least Mean Square algorithms 


As we outlined in Section 2.1, in practical applications adaptive algorithms which possess
high convergence speed while maintaining small convergence error are of great interest.
For instance, in channel equalization during the transient period, the frequency
characteristic of the adaptive equalizer is far from the inverse of the frequency response of
the channel; therefore, the data transmitted during this time will be corrupted. In echo
cancellation, if the coefficients of the adaptive canceler are not close to the
coefficients of the FIR filter which models the echo path, the resulting echo signal is not
attenuated. In this application it is even possible that the echo is amplified during the
transient period. As a consequence, the transient period of the adaptive
filter must be as short as possible for most practical applications in order to
improve the overall quality of the system.


The Least Mean Square algorithm has a small computational complexity; therefore,
it is very simple to implement in practice. Despite its simplicity, one of its main
drawbacks is the fact that the speed of convergence and the steady-state error depend on
the same parameter, the step-size $\mu$. This conclusion was pointed out in Section 2.1.1,
equations (2.27) and (2.28), where we have seen that there is a direct dependence between
the step-size and the misadjustment, while the convergence time is inversely proportional
to $\mu$. In conclusion, when a constant step-size is used in the LMS, there is a tradeoff
between the steady-state error and the convergence speed, which prevents a fast convergence
when the step-size is chosen to be small for small output error. In order to deal with this
problem, a simple idea is to use a step-size which is time-varying during the adaptation.
At early stages of the adaptation, when the adaptive filter is far from the optimum, a
larger value of the step-size should be used. This will shorten the transient period and
increase the convergence speed of the adaptive filter. As the adaptive filter approaches
the optimum Wiener solution, the step-size should be decreased, and with it the misadjustment
expressed by (2.26). The adaptive algorithms derived from the LMS which use a
time-varying step-size modified as described above belong to the class of Variable Step-Size
LMS (VSSLMS) algorithms. We should note that there are other adaptive algorithms
with time-varying step-size, which we do not include in the class of VSSLMS because
they use step-size adaptation for other purposes, such as finding the optimum
step-size for time-varying environments.

In this section, we focus on the class of VSSLMS which uses a time-varying
step-size to improve the convergence speed and also to reduce the tradeoff between the
misadjustment and the convergence time. Other algorithms with time-varying step-size are
discussed in subsequent sections.


2.2.1 Existing approaches 


We review here some of the most cited algorithms from the class of VSSLMS; in the
next two sections we then introduce two new VSSLMS algorithms. All the algorithms
in this section are described with reference to Fig. 2.1.

The VSSLMS algorithm

The VSSLMS algorithm, first introduced by Kwong and Johnston in [48], uses the following
update formula for the adaptive filter coefficients:

$$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu(n)e(n)\mathbf{x}(n), \tag{2.82}$$

where $\hat{\mathbf{h}}(n)$ is the $N \times 1$ vector of the adaptive filter coefficients, $\mathbf{x}(n)$ is the vector of the
past $N$ samples from the input sequence $x(n)$, $\mu(n)$ is a time-varying step-size and $e(n)$
is the output error.


The time-varying step-size is adapted according to the following equations:

$$\mu'(n+1) = \alpha\mu(n) + \gamma e^2(n),$$
$$\mu(n+1) = \begin{cases} \mu_{max}, & \text{if } \mu'(n+1) > \mu_{max}, \\ \mu_{min}, & \text{if } \mu'(n+1) < \mu_{min}, \\ \mu'(n+1), & \text{otherwise}, \end{cases} \tag{2.83}$$

with $0 < \alpha < 1$ and $\gamma > 0$ being constant parameters and $\mu_{max}$ and $\mu_{min}$ being the
upper and lower bounds of the time-varying step-size.

The constant parameter $\mu_{max}$, which is normally selected close to the instability point
of the conventional LMS algorithm, is used to increase the convergence speed, while the
parameter $\mu_{min}$ is chosen to provide a good compromise between the steady-state
misadjustment and the tracking capability of the algorithm. The parameter $\gamma$ is used to control
the convergence time and also the steady-state level of the misadjustment. The behavior
of the step-size as described in (2.83) is the following: at early stages of the adaptation
(when the coefficients $\hat{\mathbf{h}}(n)$ are far from the optimum solution) the step-size is increased
due to the large value of the output error. As the algorithm approaches the steady-state,
the value of $e(n)$ decreases, which decreases the step-size $\mu(n)$.
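
The step-size recursion of (2.83) amounts to a first-order smoothing of the squared error followed by clipping. A minimal sketch is given below; the function name `vss_step_update` and the default parameter values are illustrative assumptions, not values prescribed by the thesis or by [48].

```python
def vss_step_update(mu, e, alpha=0.97, gamma=4.8e-4, mu_min=1e-4, mu_max=0.1):
    """One step-size update in the form of (2.83).

    mu    : current step-size mu(n)
    e     : current output error e(n)
    alpha : forgetting factor, 0 < alpha < 1
    gamma : positive gain applied to the squared error
    """
    mu_candidate = alpha * mu + gamma * e * e       # mu'(n+1)
    # Clip the candidate to the interval [mu_min, mu_max].
    return min(mu_max, max(mu_min, mu_candidate))
```

In an adaptive filter loop this function would be called once per iteration, after the output error has been computed and before the next coefficient update (2.82).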

The following approximate analytical expression for the steady-state misadjustment
of the VSSLMS algorithm was derived in [48]:

$$M = \frac{1 - \sqrt{1 - 2\dfrac{(3-\alpha)\gamma J_{min}}{1-\alpha^2}\mathrm{tr}\left[\mathbf{R}\right]}}{1 + \sqrt{1 - 2\dfrac{(3-\alpha)\gamma J_{min}}{1-\alpha^2}\mathrm{tr}\left[\mathbf{R}\right]}}. \tag{2.84}$$

Clearly, the steady-state misadjustment depends on the parameter $\gamma$ and on the
minimum value of the MSE $J_{min}$. Since the speed of convergence of the algorithm also depends
on the parameter $\gamma$, we can conclude that there is still a dependence between the
misadjustment and the convergence time. Another drawback of this algorithm is the fact
that the steady-state misadjustment also depends on $J_{min}$. For instance, in system
identification applications the minimum MSE equals the output noise variance; therefore the
steady-state misadjustment depends on the system noise.


The robust variable step-size LMS algorithm 


In order to reduce the influence of $J_{min}$ on the steady-state misadjustment (see (2.84)), the
Robust Variable Step-Size LMS (RVSLMS) algorithm was proposed in [1]. The RVSLMS
uses an estimate of the autocorrelation between the output errors at adjacent time instants,
$e(n)$ and $e(n-1)$, to control the step-size. The same objective is followed: to increase the
step-size at early stages of the adaptation and to decrease $\mu(n)$ when the algorithm approaches
the steady-state.

The following update equations are used for the step-size:

$$p(n) = \beta p(n-1) + (1-\beta)e(n)e(n-1),$$
$$\mu'(n+1) = \alpha\mu(n) + \gamma p^2(n),$$
$$\mu(n+1) = \begin{cases} \mu_{max}, & \text{if } \mu'(n+1) > \mu_{max}, \\ \mu_{min}, & \text{if } \mu'(n+1) < \mu_{min}, \\ \mu'(n+1), & \text{otherwise}, \end{cases} \tag{2.85}$$

where the parameters $\alpha$, $\gamma$, $\mu_{min}$ and $\mu_{max}$ are the same as those used in the VSSLMS
algorithm from [48] and $0 < \beta < 1$ is an exponential weighting factor which controls the
quality of the estimation of $p(n)$.
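
The RVSLMS step-size control can be sketched in a few lines. The code below assumes the recursion p(n) = beta p(n-1) + (1 - beta) e(n) e(n-1) followed by mu'(n+1) = alpha mu(n) + gamma p^2(n) and clipping, as in (2.85); the function name `rvs_step_update` and the default parameter values are hypothetical choices for illustration.

```python
def rvs_step_update(mu, p_prev, e, e_prev,
                    alpha=0.97, beta=0.99, gamma=1.0, mu_min=1e-4, mu_max=0.1):
    """One RVSLMS-style step-size update in the form of (2.85).

    Returns the pair (mu(n+1), p(n)) so that p(n) can be fed back at the next call.
    """
    # Exponentially weighted estimate of the correlation between adjacent errors.
    p = beta * p_prev + (1.0 - beta) * e * e_prev
    mu_candidate = alpha * mu + gamma * p * p       # mu'(n+1)
    # Clip the candidate to the interval [mu_min, mu_max].
    mu_next = min(mu_max, max(mu_min, mu_candidate))
    return mu_next, p
```

Because uncorrelated noise samples at adjacent instants average out in p(n), the step-size driven by p^2(n) is less sensitive to the output noise than one driven directly by e^2(n), which is the motivation stated above.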

The misadjustment of the RVSLMS was derived in [1] for both stationary and
non-stationary environments. For a stationary environment the steady-state misadjustment is
given by the following expression [1]:

$$M = \frac{\gamma J_{min}^2(1-\beta)}{2(1-\alpha)(1+\beta)}\mathrm{tr}\left[\mathbf{R}\right]. \tag{2.86}$$

Although the misadjustment of the RVSLMS still depends on $J_{min}$, the parameter $\beta$, which
is introduced in addition to $\gamma$, also controls the steady-state misadjustment. The
parameter $\beta$ can be chosen such that a small $M$ is obtained while maintaining a large
$\gamma$, which increases the convergence speed.


The complementary pair LMS algorithm 


In order to obtain an algorithm which improves the convergence speed while maintaining
a small steady-state error, the so-called Complementary Pair LMS (CP-LMS) algorithm
was proposed in [61]. The block diagram of an adaptive filter implementing this algorithm
is depicted in Fig. 2.16, where the speed mode filter $\hat{\mathbf{h}}_1(n)$ and the accuracy mode filter
$\hat{\mathbf{h}}_2(n)$ represent two adaptive filters whose coefficients are adapted by the LMS algorithm
with constant step-sizes $\mu_1$ and $\mu_2$ respectively. The speed mode filter uses a large
step-size and serves to increase the convergence speed, while the accuracy mode filter, which
uses a small step-size, is implemented to obtain a small steady-state error. The
filter of interest is the accuracy mode filter; the other adaptive filter is used just to
increase the speed, as will become clear in the sequel.

Figure 2.16: The block diagram of the complementary pair LMS.

Although this adaptive filtering
structure does not use a time-varying step-size in the update equation, we have chosen to
include it in the class of VSSLMS because the coefficients of the accuracy mode filter are not
adapted using only the step-size $\mu_2$; they are also adapted by $\mu_1$. The coefficients of
the accuracy and speed mode filters are updated as in the following equations:

• The update equation of the accuracy mode filter coefficients:

$$\hat{\mathbf{h}}_2(n+1) = \begin{cases} \hat{\mathbf{h}}_1(n), & \text{if } n = T, 2T, \ldots \text{ and } \prod_{j=1}^{L}\Omega(n - jT) = 1, \\ \hat{\mathbf{h}}_2(n) + \mu_2 e_2(n)\mathbf{x}(n), & \text{otherwise}; \end{cases} \tag{2.87}$$

where $\Omega$ is computed as follows:

$$\Omega(m) = \begin{cases} 1, & \text{if } \sum_{n=m}^{m+T} e_1^2(n) < \sum_{n=m}^{m+T} e_2^2(n), \\ 0, & \text{otherwise}; \end{cases} \tag{2.88}$$

• The coefficients of the speed mode filter are updated as follows:

$$\hat{\mathbf{h}}_1(n+1) = \hat{\mathbf{h}}_1(n) + \mu_1 e_1(n)\mathbf{x}(n), \tag{2.89}$$

where $\hat{\mathbf{h}}_1(n)$ is the coefficient vector of the speed mode filter, $\hat{\mathbf{h}}_2(n)$ is the coefficient
vector of the accuracy mode filter, $\mu_1$ and $\mu_2$ are the step-sizes of the speed mode
filter and accuracy mode filter respectively; $e_1(n)$ and $e_2(n)$ are the errors of the
speed mode filter and accuracy mode filter, which are computed as follows (see Fig.
2.16):

$$e_1(n) = d(n) - \hat{y}_1(n) \quad \text{and} \quad e_2(n) = d(n) - \hat{y}_2(n). \tag{2.90}$$


As we can see from equations (2.87), (2.88) and (2.89), the CP-LMS algorithm
presented in [61] consists of two adaptive filters that operate in parallel: one with a large
step-size, called the speed mode filter, and another with a small step-size, called the accuracy mode filter.
Both adaptive filters use a fixed step-size in the adaptation process and their coefficients
are updated using the standard LMS algorithm for $T$ consecutive iterations,
which represent the test interval. At the end of each test interval the local averages of
the square errors of both adaptive filters are computed and the coefficients of the accuracy
mode filter are updated accordingly. If the local average of the square error of the accuracy
mode filter is larger than the local average of the square error of the speed mode filter
for $L$ consecutive test intervals, the coefficients of the accuracy mode filter are replaced
with the coefficients of the speed mode filter. The reason for this update is the following:
when $\sum_{n=m}^{m+T} e_1^2(n) < \sum_{n=m}^{m+T} e_2^2(n)$, the speed mode filter is closer to the optimum
solution than $\hat{\mathbf{h}}_2(n)$ and its coefficients should be used. When both $\hat{\mathbf{h}}_1(n)$ and $\hat{\mathbf{h}}_2(n)$ are close
to the optimum solution, the accuracy mode filter will perform better than $\hat{\mathbf{h}}_1(n)$ and its
coefficients are adapted using the LMS with a small and fixed step-size.


2.2.2 Complementary Pair Variable Step-Size LMS algorithm 


In the CP-LMS algorithm, the coefficients of the accuracy mode filter are re-initialized
with the coefficients of the speed mode filter every time the local sum of the square
error $e_1^2(n)$ is less than the local sum of the square error $e_2^2(n)$ for $L$ consecutive test
intervals. This is because in that case the coefficients $\hat{\mathbf{h}}_1(n)$ are closer to the optimum.
This observation can be extended to the step-size as well. For a test interval in which
the speed mode filter performs better than the accuracy mode filter, not only are its coefficients
closer to the optimum but its larger step-size is also a better choice. As a consequence,
the step-size of the accuracy mode filter must be increased. On the contrary, when the
accuracy mode filter is closer to the optimum solution, its step-size is decreased in order
to obtain a desired level of the steady-state misadjustment. The new algorithm, called
Complementary Pair Variable Step-size LMS (CP-VSLMS), uses the idea of the CP-LMS
algorithm proposed in [61]; the difference between the two algorithms is
that $\mu_2$ is time-varying.

In the case of the CP-VSLMS algorithm, the coefficients of the filter with step $\mu_2(n)$
are re-initialized in the same way as in the CP-LMS algorithm when the local average
of $e_2^2(n)$ is larger than the local average of $e_1^2(n)$. At the same time, the step
$\mu_2(n+1)$ is changed to the value $\dfrac{\mu_1 + \mu_2(n)}{2}$, which increases the convergence speed of
the algorithm. When the filter with step $\mu_2(n)$ is closer to the optimum, the step $\mu_2(n)$ is
decreased in order to obtain a small steady-state error. As a consequence, the CP-VSLMS
algorithm can be described by the following steps:


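1. Adaptation of the speed filter coefficients:

$$\hat{\mathbf{h}}_1(n+1) = \hat{\mathbf{h}}_1(n) + \mu_1 e_1(n)\mathbf{x}(n), \tag{2.91}$$

where $e_1(n) = d(n) - \hat{y}_1(n)$.

2. Adaptation of the coefficients $\hat{\mathbf{h}}_2(n)$:

$$\hat{\mathbf{h}}_2(n+1) = \begin{cases} \hat{\mathbf{h}}_1(n+1), & \text{if } n = T, 2T, \ldots \text{ and } \prod_{i=1}^{L}\Omega(n - iT) = 1, \\ \hat{\mathbf{h}}_2(n) + \mu_2(n)e_2(n)\mathbf{x}(n), & \text{otherwise}, \end{cases} \tag{2.92}$$

and $e_2(n) = d(n) - \hat{y}_2(n)$.

3. The re-initialization of the variable step:

$$\mu_2(n+1) = \begin{cases} \dfrac{\mu_1 + \mu_2(n)}{2}, & \text{if } n = T, 2T, \ldots \text{ and } \prod_{i=1}^{L}\Omega(n - iT) = 1, \\ \max\left\{\alpha\mu_2(n), \mu_3\right\}, & \text{otherwise}, \end{cases} \tag{2.93}$$

where:

$$\Omega(m) = \begin{cases} 1, & \text{if } \sum_{i=m}^{m+T} e_1^2(i) < \sum_{i=m}^{m+T} e_2^2(i), \\ 0, & \text{otherwise}. \end{cases} \tag{2.94}$$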


The filter of interest is $\hat{\mathbf{h}}_2(n)$ and its step-size is increased when it converges more slowly
than $\hat{\mathbf{h}}_1(n)$. The step-size $\mu_2(n)$ is decreased when the filter $\hat{\mathbf{h}}_2(n)$ is near the steady-state.
The coefficients of the adaptive filter $\hat{\mathbf{h}}_2(n)$ are re-initialized as in the CP-LMS algorithm.
The minimum value of the step $\mu_2(n)$ is obtained when the algorithm is at the
steady-state and it is close to $\mu_3$. The maximum value of $\mu_2(n)$ is obtained when the algorithm
is far from the steady-state; it can in some cases be very close to $\mu_1$, but it is always
smaller than $\mu_1$.

The parameters $\mu_1$ and $\mu_3$, which are the upper and the lower bounds of $\mu_2(n)$, must
be chosen to ensure convergence. Moreover, if $\mu_1$ is close to the value of $\mu_3$, the
number of changes in the step-size $\mu_2(n)$ is small. In order to provide a large number of
modifications of $\mu_2(n)$ we must choose $\mu_3 \ll \mu_1$. As a result, the parameters $\mu_1$ and $\mu_3$
must be chosen to satisfy the following condition:

$$\mu_3 \ll \mu_1 < \frac{2}{3\,\mathrm{tr}\left[\mathbf{R}\right]}, \tag{2.95}$$

where $\mathbf{R}$ is the input autocorrelation matrix.

The value of the parameter $\alpha$ in equation (2.93) must be in the interval $(0, 1)$, such
that the step $\mu_2(n)$ is decreased when $\hat{\mathbf{h}}_2(n)$ performs better than $\hat{\mathbf{h}}_1(n)$. The parameter
$L$, the number of consecutive test intervals over which the sums of the square errors are
compared, is used to avoid erroneous adaptations of the step-size and the coefficients.
The convergence speed of the CP-VSLMS algorithm can be modified through $\mu_1$, $\alpha$ and $T$, and
the steady-state level of the misadjustment is set by selecting the value of $\mu_3$.
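
The three steps of the CP-VSLMS algorithm can be put together in a compact sketch. The code below is an illustration under stated assumptions rather than the reference implementation: the function name `cp_vslms` and the default values are hypothetical, the decrease of the step-size is applied only at the ends of test intervals (so that both step-sizes stay constant within an interval), and the product over the last L values of the indicator is implemented as a counter of consecutive intervals in which the speed mode filter had the smaller error sum.

```python
def cp_vslms(x, d, N, mu1, mu3, alpha=0.9, T=50, L=1):
    """Sketch of the CP-VSLMS steps (2.91)-(2.94).

    x, d : input and desired sequences (same length)
    N    : filter length
    mu1  : fixed step-size of the speed mode filter
    mu3  : lower bound of the variable step-size mu2
    Returns (h2, mu2_history), with mu2 recorded at the end of each test interval.
    """
    h1 = [0.0] * N            # speed mode filter
    h2 = [0.0] * N            # accuracy mode filter (filter of interest)
    mu2 = mu3
    x_vec = [0.0] * N
    s1 = s2 = 0.0             # squared-error sums over the current test interval
    wins = 0                  # consecutive intervals won by the speed mode filter
    mu2_history = []
    for n, (xn, dn) in enumerate(zip(x, d), start=1):
        x_vec = [xn] + x_vec[:-1]
        e1 = dn - sum(a * b for a, b in zip(h1, x_vec))
        e2 = dn - sum(a * b for a, b in zip(h2, x_vec))
        h1 = [a + mu1 * e1 * b for a, b in zip(h1, x_vec)]
        h2 = [a + mu2 * e2 * b for a, b in zip(h2, x_vec)]
        s1 += e1 * e1
        s2 += e2 * e2
        if n % T == 0:                          # end of a test interval
            wins = wins + 1 if s1 < s2 else 0
            if wins >= L:
                h2 = list(h1)                   # re-initialize the accuracy mode filter
                mu2 = 0.5 * (mu1 + mu2)         # move mu2 halfway toward mu1
            else:
                mu2 = max(alpha * mu2, mu3)     # shrink mu2, bounded below by mu3
            s1 = s2 = 0.0
            mu2_history.append(mu2)
    return h2, mu2_history
```

With L = 1 the step-size reacts to every test interval; larger values of L trade reaction speed for robustness against intervals in which the speed mode filter wins by chance.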


Theoretical analysis 


It is easy to show, by writing the Wiener-Hopf equations, that both adaptive filters converge
to the same optimum solution given by the following equation:

$$\mathbf{h}_o = \mathbf{R}^{-1}\mathbf{p}, \tag{2.96}$$

where $\mathbf{R}$ is the input autocorrelation matrix and $\mathbf{p}$ is the cross-correlation vector between
the desired sequence and the input.

To examine the behavior of the coefficients of the CP-VSLMS algorithm we analyze
both adaptive filters during one test interval. Suppose that the analysis is made on the interval
from $(k-1)T$ to $kT$ and that at $(k-1)T$ the coefficients $\hat{\mathbf{h}}_2((k-1)T)$ were re-initialized with
$\hat{\mathbf{h}}_1((k-1)T)$. During one test interval the adaptive filters have constant step-sizes $\mu_1$ and
$\mu_2((k-1)T)$. At the end of the test interval, the averages of the coefficient error vectors
are obtained as follows (see Section 2.1.1):

$$E\left[\Delta\mathbf{h}_1(kT)\right] = \left[\mathbf{I} - \mu_1\mathbf{R}\right]^{T} E\left[\Delta\mathbf{h}_1((k-1)T)\right],$$
$$E\left[\Delta\mathbf{h}_2(kT)\right] = \left[\mathbf{I} - \mu_2((k-1)T)\mathbf{R}\right]^{T} E\left[\Delta\mathbf{h}_2((k-1)T)\right], \tag{2.97}$$

where $k$ is an integer and the exponent $T$ denotes the $T$-th matrix power (one factor per iteration of the test interval).
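Writing the eigenvalue decomposition $\mathbf{R} = \mathbf{Q}\boldsymbol{\Lambda}\mathbf{Q}^T$ of the matrix $\mathbf{R}$, equation (2.97)
can be written as follows for every coefficient:

$$E\left[v_{1,i}(kT)\right] = \left(1 - \mu_1\lambda_i\right)^{T} E\left[v_{1,i}((k-1)T)\right], \quad i = 1, \ldots, N,$$
$$E\left[v_{2,i}(kT)\right] = \left(1 - \mu_2((k-1)T)\lambda_i\right)^{T} E\left[v_{2,i}((k-1)T)\right], \quad i = 1, \ldots, N, \tag{2.98}$$

where $\mathbf{v}_2(kT) = \mathbf{Q}^T\Delta\mathbf{h}_2(kT)$, $\mathbf{v}_1(kT) = \mathbf{Q}^T\Delta\mathbf{h}_1(kT)$, $\lambda_i$ is the $i$-th eigenvalue of $\mathbf{R}$, and $v_{1,i}(n)$
and $v_{2,i}(n)$ are the $i$-th elements of $\mathbf{v}_1(n)$ and $\mathbf{v}_2(n)$ respectively.

From (2.98) it is clear that the convergence of the coefficients of the speed mode filter
to the optimum is much faster than the convergence of $\hat{\mathbf{h}}_2(n)$, due to the fact that the
step-size $\mu_2(n)$ is less than $\mu_1$ for all time instants $n$ from $(k-1)T$ to $kT$ and $v_{1,i}((k-1)T) =
v_{2,i}((k-1)T)$ for all $i$.²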

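The MSE at the end of the test interval can be obtained, taking into account that the
step-sizes $\mu_1$ and $\mu_2(n)$ are constant for $n = (k-1)T, \ldots, kT$, as follows:

$$J_1(kT) = J_{min} + \mathrm{tr}\left[\mathbf{R}\mathbf{C}_1(kT)\right],$$
$$J_2(kT) = J_{min} + \mathrm{tr}\left[\mathbf{R}\mathbf{C}_2(kT)\right], \tag{2.99}$$

where the minimum MSE is the same for both adaptive filters, $J_{min} = \lim E\left[e^2(n)\right] =
\lim E\left[\left(d(n) - \mathbf{h}_o^T\mathbf{x}(n)\right)^2\right]$, and the correlation matrices of the coefficient error vectors
are expressed as $\mathbf{C}_1(n) = E\left[\Delta\mathbf{h}_1(n)\Delta\mathbf{h}_1^T(n)\right]$ and $\mathbf{C}_2(n) = E\left[\Delta\mathbf{h}_2(n)\Delta\mathbf{h}_2^T(n)\right]$.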

In [20], p. 156, the following approximate expression for the transient MSE of the LMS
with fixed step-size is given:

$$J(n) \approx J_{min} + \sum_{i=1}^{N}\lambda_i\left(1 - \mu\lambda_i\right)^{2n}v_i^2(0), \tag{2.100}$$

where $\lambda_i$ is the $i$-th eigenvalue of the input autocorrelation matrix, $v_i(0)$ is the $i$-th element
of the vector $\mathbf{v}(0) = \mathbf{Q}^T\left[\hat{\mathbf{h}}(0) - \mathbf{h}_o\right]$ and the columns of $\mathbf{Q}$ represent the eigenvectors of
the input autocorrelation matrix.

² This is due to the fact that $\hat{\mathbf{h}}_2((k-1)T) = \hat{\mathbf{h}}_1((k-1)T)$, which implies $\mathbf{v}_1((k-1)T) =
\mathbf{v}_2((k-1)T)$.

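Taking into account that during one test interval the step-sizes of both adaptive filters
are constant, the following equations can be obtained from (2.100):

$$J_1(kT) \approx J_{min} + \sum_{i=1}^{N}\lambda_i\left(1 - \mu_1\lambda_i\right)^{2T}v_{1,i}^2((k-1)T),$$
$$J_2(kT) \approx J_{min} + \sum_{i=1}^{N}\lambda_i\left(1 - \mu_2((k-1)T)\lambda_i\right)^{2T}v_{2,i}^2((k-1)T), \tag{2.101}$$

where $\lambda_i$ is the $i$-th eigenvalue of $\mathbf{R}$, $v_{1,i}((k-1)T)$ is the $i$-th element of $\mathbf{v}_1((k-1)T) =
\mathbf{Q}^T\Delta\mathbf{h}_1((k-1)T)$, with $\Delta\mathbf{h}_1((k-1)T) = \hat{\mathbf{h}}_1((k-1)T) - \mathbf{h}_o$, and $v_{2,i}((k-1)T)$ is the $i$-th element of
$\mathbf{v}_2((k-1)T) = \mathbf{Q}^T\Delta\mathbf{h}_2((k-1)T)$, with $\Delta\mathbf{h}_2((k-1)T) = \hat{\mathbf{h}}_2((k-1)T) - \mathbf{h}_o$.

We assume that at the beginning of the test interval the coefficients $\hat{\mathbf{h}}_2((k-1)T)$ are
re-initialized with $\hat{\mathbf{h}}_1((k-1)T)$, therefore we have:

$$v_{1,i}((k-1)T) = v_{2,i}((k-1)T). \tag{2.102}$$

Since $\mu_2((k-1)T) < \mu_1$ and, by (2.95), $0 < \mu_1\lambda_i < 1$, each factor $\left(1 - \mu_1\lambda_i\right)^{2T}$ is smaller than
the corresponding factor $\left(1 - \mu_2((k-1)T)\lambda_i\right)^{2T}$. Clearly, from (2.101) and (2.102) it then follows
that the MSE of the speed mode filter $\hat{\mathbf{h}}_1(n)$ is smaller than the MSE of the adaptive filter $\hat{\mathbf{h}}_2(n)$
at the end of the test interval.³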
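
The comparison of the two transient MSE values can be checked numerically with the approximation (2.100). The snippet below is a small sketch; the function name `transient_mse` and the eigenvalues, initial errors and step-sizes used in the example are arbitrary illustrative assumptions.

```python
def transient_mse(j_min, eigvals, v0, mu, n):
    """Approximate transient MSE after n iterations, following (2.100).

    eigvals : eigenvalues lambda_i of the input autocorrelation matrix
    v0      : rotated initial coefficient errors v_i(0)
    mu      : fixed step-size used during the n iterations
    """
    return j_min + sum(lam * (1.0 - mu * lam) ** (2 * n) * v * v
                       for lam, v in zip(eigvals, v0))


# Both filters start a test interval of length T = 50 from the same coefficients.
eigvals = [0.5, 1.0, 1.5, 1.0]
v_start = [1.0, 1.0, 1.0, 1.0]
j_speed = transient_mse(0.01, eigvals, v_start, mu=0.05, n=50)
j_accuracy = transient_mse(0.01, eigvals, v_start, mu=0.01, n=50)
```

In this example `j_speed` is smaller than `j_accuracy`, in line with the conclusion that, within the transient period, the filter with the larger step-size ends the test interval with the smaller MSE.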

The above analysis was made for a test interval ($n$ from $(k-1)T$ to $kT$) with the
assumption that the coefficients $\hat{\mathbf{h}}_2(n)$ are initialized with $\hat{\mathbf{h}}_1(n)$ at $n = (k-1)T$. If at
the beginning of the test interval the coefficients $\hat{\mathbf{h}}_2(n)$ are not re-initialized, the same
analysis can be extended to two or more consecutive intervals. Let us consider that at
$n = (k-1)T$ the coefficients $\hat{\mathbf{h}}_2(n)$ are not re-initialized, but they are re-initialized at
$n = (k-2)T$. The average of the coefficient errors at time instant $n = kT$ can be
expressed as follows:

$$E\left[v_{1,i}(kT)\right] = \left(1 - \mu_1\lambda_i\right)^{2T} E\left[v_{1,i}((k-2)T)\right], \quad i = 1, \ldots, N,$$
$$E\left[v_{2,i}(kT)\right] = \left(1 - \mu_2((k-2)T)\lambda_i\right)^{T}\left(1 - \mu_2((k-1)T)\lambda_i\right)^{T} E\left[v_{2,i}((k-2)T)\right], \quad i = 1, \ldots, N. \tag{2.103}$$

Since $E\left[v_{1,i}((k-2)T)\right] = E\left[v_{2,i}((k-2)T)\right]$ and both $\mu_2((k-2)T)$ and $\mu_2((k-1)T)$ are smaller than $\mu_1$, it
follows that $E\left[v_{1,i}(kT)\right] < E\left[v_{2,i}(kT)\right]$ for every coefficient of the adaptive filters.

³ In Section 2.1.1, we have concluded that the steady-state MSE decreases when the step-size is
decreased. Here we discuss the behavior of the MSE after $T$ consecutive iterations during
the transient period of the adaptive filters. The conclusion is that a larger step-size will give a smaller
transient MSE, which is intuitively correct since a larger step-size means faster convergence to the optimum
solution.

The transient MSE of both adaptive filters at time instant $n = kT$ can be
approximated by:

$$J_1(kT) \approx J_{min} + \sum_{i=1}^{N}\lambda_i\left(1 - \mu_1\lambda_i\right)^{4T}v_{1,i}^2((k-2)T),$$
$$J_2(kT) \approx J_{min} + \sum_{i=1}^{N}\lambda_i\left(1 - \mu_2((k-1)T)\lambda_i\right)^{2T}\left(1 - \mu_2((k-2)T)\lambda_i\right)^{2T}v_{2,i}^2((k-2)T). \tag{2.104}$$

As a consequence, the speed mode filter has a smaller transient MSE than the accuracy
mode filter $\hat{\mathbf{h}}_2(n)$.

From the above analytical results we can conclude that the transient MSE of the
speed mode filter is smaller than the MSE of $\hat{\mathbf{h}}_2(n)$ at the end of every test interval. This
conclusion justifies the method for step-size update in (2.93) and the re-initialization of
the accuracy mode filter coefficients from (2.92).

Of course, when the speed mode filter is at steady-state, its MSE can be approximated
by:

$$J_{1,st} = J_{min}\left(1 + \frac{\mu_1}{2}\mathrm{tr}\left[\mathbf{R}\right]\right). \tag{2.105}$$

Since the speed mode filter converges faster than $\hat{\mathbf{h}}_2(n)$, the step-size $\mu_2(n)$ will only be
decreased after $\hat{\mathbf{h}}_1(n)$ has converged. Finally, the accuracy mode filter converges to the
following level of the MSE:

$$J_{2,st} = J_{min}\left(1 + \frac{\mu_3}{2}\mathrm{tr}\left[\mathbf{R}\right]\right), \tag{2.106}$$

obtained when $\mu_2(n)$ attains its minimum bound.


2.2.3 Noise Constrained Variable Step-Size LMS algorithm


The drawback of the CP-VSLMS algorithm described in the previous section is its
increased computational complexity, as compared with the VSSLMS and RVSLMS
algorithms of [48] and [1], due to the use of two adaptive filters which operate in parallel.
In this section we introduce a variable step-size LMS algorithm that exploits
information about the noise in order to obtain fast convergence and a small steady-state error.
The proposed filtering structure is introduced mainly for system identification
applications and it uses just one adaptive filter, such that the computational complexity is
considerably decreased. The block diagram of an FIR adaptive filter implementing the proposed Noise
Constrained Variable Step-Size LMS (NCVSLMS) algorithm for system identification is
depicted in Fig. 2.2. In this figure, $\mathbf{h}$ is the unknown system modelled as an FIR filter of
length $N$, $\hat{\mathbf{h}}(n)$ is the adaptive FIR filter, $x(n)$ is the input sequence, and $y(n)$ and $\hat{y}(n)$ are
the outputs of the unknown system and the adaptive filter respectively.


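The coefficients of the adaptive filter are updated by the following equation:

$$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu(n)e(n)\mathbf{x}(n), \tag{2.107}$$

where $e(n) = y(n) - \hat{y}(n) + v(n)$ is the output error, $y(n) = \mathbf{h}^T(n)\mathbf{x}(n)$, $\hat{y}(n) = \hat{\mathbf{h}}^T(n)\mathbf{x}(n)$
and $v(n)$ is the output noise.

To update the variable step-size $\mu(n)$ we propose the following formula:

$$\mu(n+1) = \begin{cases} \dfrac{\mu(n) + \mu_{max}}{2}, & \text{if } \mathrm{med}_T\left[e^2(n)\right] > C \text{ and } n = T, 2T, \ldots, \\ \max\left\{\alpha\mu(n), \mu_{min}\right\}, & \text{if } \mathrm{med}_T\left[e^2(n)\right] \leq C \text{ and } n = T, 2T, \ldots, \\ \mu(n), & \text{otherwise}, \end{cases} \tag{2.108}$$

where $\mu_{min}$ and $\mu_{max}$ are the minimum and the maximum bounds of the step-size and
$\mathrm{med}_T\left[e^2(n)\right]$ is the median of $\left[e^2(n), e^2(n-1), \ldots, e^2(n-T+1)\right]$.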

The behavior of the step-size of the proposed algorithm can be described as follows. For $T$ consecutive iterations (the test interval of length $T$) the step-size $\mu(n)$ is constant and the coefficients of the adaptive filter are updated as in the case of a fixed step-size$^4$. At the end of the test interval, the median of the squared output error is computed and compared with a threshold $C$. If the median is larger than this threshold, the algorithm is still in the transient period; therefore, the step-size is increased to obtain faster convergence. On the contrary, when the median is comparable with the threshold, the algorithm is already at the steady-state and the step-size is decreased in order to obtain a smaller misadjustment. The minimum value of the step-size is $\mu_{\min}$ and it is attained when the adaptive filter is at the steady-state.

$^4$ Since $\mu(n)$ is constant during a test interval, (2.107) is equivalent to (2.12) for the standard LMS algorithm for $T$ consecutive iterations.

As we can see from the above description of the algorithm, the main idea is to increase the step-size $\mu(n)$ during the transient period and to decrease its value at the steady-state. As a consequence, the threshold $C$ must contain information which allows us to test whether the adaptive filter is at the steady-state. To this end we propose to use (2.24), which approximates the steady-state MSE in the case of a fixed step-size, and the threshold is given by:

$$C(n) = \sigma_v^2\left(1 + \frac{\mu(n)}{2}N\sigma_x^2\right) \tag{2.109}$$


where $\sigma_v^2$ is the variance of the output noise and $\sigma_x^2$ is the variance of the input sequence.

The threshold $C(n)$ described by (2.109) is not constant during the adaptation process; it is recomputed every time the step-size is changed, because the steady-state MSE depends on the value of the step-size.

For the step-size bounds $\mu_{\min}$ and $\mu_{\max}$ we can choose any values as long as convergence is obtained. In order to increase the convergence speed, $\mu_{\max}$ must be chosen close to the stability boundary $\frac{2}{3N\sigma_x^2}$. The convergence speed also depends on the length $T$ of the test interval. If $T$ is too large, the algorithm converges inside the test interval and the step-size is not updated often enough. The value of $\alpha$ must be in the interval $(0,1)$ so that the step-size is decreased when the algorithm approaches the steady-state.
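The update rules (2.107)-(2.109) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used for the simulations: the function name `ncvslms`, the choice to start from the upper bound of the step-size, the zero-padded regressor at the start of the data, and in particular the form of the increase rule (`mu / alpha` clipped at `mu_max`) are assumptions made for the example.

```python
import numpy as np

def ncvslms(x, d, N, mu_min, mu_max, alpha, T, sigma_v2, sigma_x2):
    """LMS with the step-size revised at the end of every test interval of length T."""
    h_hat = np.zeros(N)
    mu = mu_max                          # assumption: start from the upper bound
    e2 = np.zeros(len(x))                # squared output errors
    mu_hist = np.zeros(len(x))           # step-size used at each time instant
    for n in range(len(x)):
        xn = np.zeros(N)                 # [x(n), ..., x(n-N+1)], zeros before n = 0
        k = min(N, n + 1)
        xn[:k] = x[n::-1][:k]
        e = d[n] - h_hat @ xn            # d(n) = y(n) + v(n) is the noisy system output
        h_hat = h_hat + mu * e * xn      # coefficient update (2.107)
        e2[n] = e * e
        mu_hist[n] = mu
        if (n + 1) % T == 0:             # end of a test interval
            C = sigma_v2 * (1.0 + 0.5 * mu * N * sigma_x2)   # threshold (2.109)
            med = np.median(e2[n - T + 1:n + 1])
            if med > C:
                mu = min(mu / alpha, mu_max)   # assumed form of the increase rule
            elif med < C:
                mu = max(alpha * mu, mu_min)   # decrease towards the lower bound
    return h_hat, mu_hist
```

Calling `ncvslms(x, d, 10, 0.002, 0.05, 0.6, 100, sigma_v2, 1.0)` would correspond to the NCVSLMS row of Table 2.1; the returned `mu_hist` makes it easy to inspect when the step-size leaves $\mu_{\max}$ and settles at $\mu_{\min}$.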

At the steady-state, the step-size $\mu(n)$ converges to $\mu_{\min}$; therefore the misadjustment of the algorithm can be approximated as follows:

$$M \approx \frac{\mu_{\min} N \sigma_x^2}{2} \tag{2.110}$$
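As a rough numerical illustration, assume that the misadjustment is the relative excess term $\mu N \sigma_x^2 / 2$ of the fixed step-size approximation used in the threshold (2.109), and take the values that appear in Table 2.1 and Section 2.2.4 ($\mu_{\min} = 0.002$, $\mu_{\max} = 0.05$, $N = 10$, $\sigma_x^2 = 1$). Under these assumptions the two bounds of the step-size give:

```latex
M(\mu_{\min}) \approx \frac{\mu_{\min} N \sigma_x^2}{2}
              = \frac{0.002 \cdot 10 \cdot 1}{2} = 0.01,
\qquad
M(\mu_{\max}) \approx \frac{\mu_{\max} N \sigma_x^2}{2}
              = \frac{0.05 \cdot 10 \cdot 1}{2} = 0.25 .
```

That is, the excess MSE would be about 1% of the noise floor at the lower bound and about 25% at the upper bound, which is the trade-off the step-size adaptation tries to exploit.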


In the step-size adaptation (see (2.108) and (2.109)), accurate estimates of $\sigma_v^2$ and $\sigma_x^2$ are necessary. In some practical applications, information about the noise variance is available by modeling or measuring the noise [52], [77]. The variance of the input sequence can also be estimated. An equivalent of this algorithm, in which the input variance does not need to be computed, is described in the next chapter.


2.2.4 Simulations and results 


In this section, the above-mentioned algorithms are implemented in the system identification framework depicted in Fig. 2.2. The unknown system has $N = 10$ coefficients and all the tested adaptive filters have the same length. The parameters of the algorithms were chosen to give comparable levels of misadjustment. Moreover, the parameters were selected using the guidelines from the corresponding papers; they are shown in Table 2.1. For benchmark purposes, in addition to the variable step-size LMS algorithms, we have also included in our simulations two LMS algorithms with fixed step-size, denoted LMS1 and LMS2. The first algorithm, LMS1, has a large step-size, whereas LMS2 has a small step-size chosen to obtain the same level of the misadjustment




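Algorithm | Parameters
LMS1      | $\mu = 0.05$
LMS2      | $\mu = 0.002$
VSSLMS    | $\mu_{\max} = 0.05$, $\mu_{\min} = 0.002$, $\alpha = 0.97$, $\gamma = 0.057$
RVSLMS    | $\mu_{\max} = 0.05$, $\mu_{\min} = 0.002$, $\alpha = 0.97$, $\beta = 0.99$, $\gamma = 1$
CP-LMS    | $\mu_1 = 0.05$, $\mu_2 = 0.002$, $T = 100$, $L = 1$
CP-VSLMS  | $\mu_1 = 0.05$, $\mu_{\min} = 0.002$, $T = 100$, $\alpha = 0.6$, $L = 1$
NCVSLMS   | $\mu_{\max} = 0.05$, $\mu_{\min} = 0.002$, $T = 100$, $\alpha = 0.6$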


Table 2.1: The parameters of the compared algorithms. 


as the VSSLMS algorithms. The output noise v(n) was zero mean Gaussian-distributed 
with variance o? = 107°. The input sequence was also zero mean Gaussian-distributed 
with unity variance. Results are obtained by averaging over 100 independent runs, each 
run containing 8 x 10° iterations. The learning curves for the compared algorithms are 
shown in Fig. 2.17 and Fig. 2.18 where we have plotted the mean square coefficient error 
defined as: 


$$\mathrm{MSC}(n) = \sum_{i=1}^{N}\left(\hat{h}_i(n) - h_i\right)^2 \tag{2.111}$$

where $h_i$ is the $i$th coefficient of the unknown filter $\mathbf{h}$ in Fig. 2.2 and $\hat{h}_i(n)$ is the $i$th coefficient of the adaptive filter $\hat{\mathbf{h}}(n)$ at time instant $n$.


In the case of CP-LMS and CP-VSLMS, there are two filters working in parallel, the speed mode filter $\mathbf{h}_1(n)$ and the accuracy mode filter $\mathbf{h}_2(n)$, and in both cases the filter of interest is $\mathbf{h}_2(n)$. Therefore, for these two algorithms (2.111) was evaluated using $\mathbf{h}_2(n)$ instead of $\hat{\mathbf{h}}(n)$.
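The coefficient error of (2.111) and its average over independent runs can be computed as in the short sketch below. The function names and the array layout `(runs, iterations, N)` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def msc(h_hat, h):
    """Squared coefficient error of (2.111) for one run at one time instant."""
    h_hat = np.asarray(h_hat, dtype=float)
    h = np.asarray(h, dtype=float)
    return float(np.sum((h_hat - h) ** 2))

def average_learning_curve(h_hat_runs, h):
    """Average the coefficient error over runs.

    h_hat_runs has shape (runs, iterations, N); the result has one value
    per iteration, i.e. the learning curve that would be plotted.
    """
    err = np.asarray(h_hat_runs, dtype=float) - np.asarray(h, dtype=float)
    return np.mean(np.sum(err ** 2, axis=2), axis=0)
```

For the CP-LMS and CP-VSLMS algorithms the coefficient history passed to these functions would be that of the accuracy mode filter.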


From the learning curves shown in Fig. 2.17 and Fig. 2.18 we can see that CP-VSLMS has the best performance among the compared algorithms. Of course LMS1, which uses a large step-size, has the fastest convergence, but at the expense of an increased steady-state MSC. The CP-LMS has the slowest convergence among the variable step-size LMS algorithms, as we can see from Fig. 2.18. However, when the step-size of the accuracy mode filter is made time-varying, the speed of the new CP-VSLMS algorithm is greatly improved. The performances of both proposed algorithms, the CP-VSLMS and NCVSLMS, are comparable.

Figure 2.17: The mean square coefficient error for LMS1, LMS2, CP-VSLMS and NCVSLMS.


2.2.5 Comparison of the variable step-size LMS algorithms 


In this section, we compare the different variable step-size LMS algorithms in terms of computational complexity, memory load and simplicity of implementation. The advantages and drawbacks of the compared algorithms are also discussed, together with some guidelines for practical implementation.

To this end, Table 2.2 shows the memory load and computational complexity of the VSSLMS, RVSLMS, CP-LMS, CP-VSLMS and NCVSLMS. Among all algorithms, the CP-LMS and CP-VSLMS have the largest complexity. However, the setup of their parameters is very simple (the minimum bound of the step-size $\mu_{\min}$ can be obtained from (2.106) for a desired level of the misadjustment).

We note that, in the case of NCVSLMS, the noise variance $\sigma_v^2$ is needed in the formula for the step-size update. As a consequence, this algorithm is more suitable in applications where $\sigma_v^2$ can be approximated. Actually, the same discussion is valid also for the VSSLMS and RVSLMS algorithms if we look at (2.84) and (2.86). These two equations are used to set up the value of the parameter $\gamma$ for both algorithms, and both equations include the value of the minimum MSE. As a consequence, the parameter $\gamma$ computed to obtain a desired misadjustment depends upon $J_{\min}$. If $J_{\min}$ increases, the value of $\gamma$ should be decreased to maintain the same output error. This problem can be avoided in the case of VSSLMS and RVSLMS if $\gamma$ is chosen to have a very small value, such that the same misadjustment is obtained for different noise levels$^5$.

Figure 2.18: The mean square coefficient error for VSSLMS, RVSLMS, CP-LMS and CP-VSLMS.


The advantages and disadvantages of the compared algorithms are summarized in Table 2.3.


                 | VSSLMS | RVSLMS | CP-LMS | CP-VSLMS | NCVSLMS
Memory load      | 2N+9   | 2N+12  | 3N+11  | 3N+14    | 2N+10
Add. and sub.    | 2N+1   | 2N+3   | 4N+2   | 4N+3     | 2N+3
Multip. and div. | 2N+4   | 2N+7   | 4N+5   | 4N+6     | 2N+6


Table 2.2: The complexity of the compared algorithms. 


$^5$ For small output SNR and a very small value of $\gamma$, the steady-state value of the step-size equals the minimum bound $\mu_{\min}$. As a consequence, the misadjustment can be computed similarly to (2.104). If $J_{\min}$ increases, the misadjustment is computed with (2.84) and (2.86), respectively.
           | VSSLMS             | RVSLMS             | CP-LMS | CP-VSLMS | NCVSLMS
Complexity | small              | small              | large  | large    | small
Speed      | fast               | fast               | slower | faster   | fast
Setup      | needs $J_{\min}$   | needs $J_{\min}$   | simple | simple   | needs $J_{\min}$


Table 2.3: Advantages and disadvantages of the compared algorithms. 


2.3 Variable length LMS algorithms


In some practical applications, such as system identification, the goal is to obtain good approximations of the coefficients of an unknown system. To this end, the LMS algorithm or some of its modifications, such as those discussed in the previous section, can be applied very easily with excellent results. However, in the existing implementations almost all authors use an adaptive filter whose length equals the length of the unknown system. Therefore, the optimum length has to be known a priori, or it is truncated to some predefined value. There are just a few implementations of variable length LMS adaptive filters [69], [80], but these implementations do not really find the optimum filter length. For instance, in [69] the authors propose a Variable Length LMS in which the length of the filter is increased as the algorithm approaches its steady-state. The adaptive filter length behaves in the same way in the algorithm proposed in [80], although the modification of the filter length is based on other formulas. In the above-mentioned papers, a maximum length $N_{\max}$ is imposed on the adaptive filter, and this length is reached at the steady-state. Actually, by implementing a variable length for the LMS algorithm, the authors of [69] and [80] wanted to improve the speed of convergence of the adaptive algorithm. Here we look at length adaptation from another point of view, namely we are also interested in approximating the correct length of the adaptive system. In Section 2.1.2, the analytical MSE was obtained as a function of the adaptive filter length. Based on that analysis, an algorithm that finds the optimum coefficients together with the optimum filter length is derived here.


2.3.1 The proposed algorithm 


Figure 2.19: Block diagram of the VLLMS for system identification.

We address the problem of system identification where we are interested not only in finding the correct values of the filter coefficients but also in approximating the correct filter length. Our proposed Variable Length LMS (VLLMS) algorithm is based on the analytical results from Section 2.1.2, where we have shown that the steady-state MSE is smaller when the length $N_{ad}$ of the adaptive filter is close to the length $N$ of the unknown system and $N_{ad} < N$. On the other hand, if $N_{ad} > N$ then the steady-state MSE does not depend on the filter length. This conclusion is valid for an uncorrelated input sequence $x(n)$ and it is generally not valid for correlated input signals. The block diagram of the proposed VLLMS algorithm is presented in Fig. 2.19, where $\mathbf{h}_1(n)$, $\mathbf{h}_2(n)$ and $\mathbf{h}_3(n)$ are three adaptive filters working in parallel. The lengths of these filters are $N_1$, $N_2$ and $N_3$, respectively, with $N_1 < N_2 < N_3$. The proposed algorithm can be summarized as follows:


1. Initialization: $N_1(0) = N_0$, $N_2(0) = N_0 + 1$, $N_3(0) = N_0 + 2$, $\mu_1(0) = \mu$, $\mu_2(0) = \frac{\mu N_0}{N_2(0)}$, $\mu_3(0) = \frac{\mu N_0}{N_3(0)}$.

2. At every time instant $n$, compute the output errors $e_l(n) = d(n) - \mathbf{h}_l^T(n)\mathbf{x}_l(n)$, $\forall l = 1, 2, 3$, and update the coefficients of the adaptive filters:

$$\mathbf{h}_l(n+1) = \mathbf{h}_l(n) + \mu_l(n)\mathbf{x}_l(n)e_l(n), \quad l = 1, 2, 3,$$

where $\mathbf{x}_l(n) = [x(n), x(n-1), \ldots, x(n-N_l+1)]^T$.

3. For every $L$th iteration (where $L$ is an integer parameter) do:

• compute the following averages: $m_l = \frac{1}{L}\sum_{j=n-L+1}^{n} e_l^2(j)$, $l = 1, 2, 3$;

• update the lengths:

$$N_1(n+1) = \begin{cases} N_1(n)+1, & \text{if } m_1 > m_2 > m_3,\\ N_1(n), & \text{if } m_1 > m_2 < m_3,\\ N_1(n)-1, & \text{otherwise,}\end{cases} \qquad \begin{aligned} N_2(n+1) &= N_1(n+1)+1,\\ N_3(n+1) &= N_1(n+1)+2; \end{aligned} \tag{2.112}$$

• update the step-sizes: $\mu_l(n+1) = \frac{\mu N_0}{N_l(n+1)}$, $\forall l = 1, 2, 3$.


The parameter $L$ in the above algorithm has to be large enough that the MSE can be approximated by the average of the past $L$ squared errors, but also small enough to allow a sufficient number of length updates. Actually, in our simulations, we have obtained good results using a variable parameter $L(n)$ chosen to be a multiple of the time constant $\tau_{mse}$:

$$L(n) = \left[P\,\tau_{mse}(n)\right] \tag{2.113}$$

where $[A]$ represents the integer part of $A$ and $P$ is an integer parameter. Note that the value of $L(n)$ is changed together with $\mu_2(n)$.

The specifics of the new algorithm are the following: the lengths $N_1(n)$, $N_2(n)$ and $N_3(n)$ are not changed at each iteration, but are kept constant for $L(n)$ iterations, after which they are changed according to (2.112) based on the estimated mean square errors. When the lengths are decreased, the last coefficient of each coefficient vector is simply eliminated. When the lengths are increased, the new coefficient added to each of the vectors $\mathbf{h}_1(n)$, $\mathbf{h}_2(n)$ and $\mathbf{h}_3(n)$ is initialized with zero. When all the lengths are smaller than the optimum length, $m_1 > m_2 > m_3$ and all the lengths are increased by one, as shown in (2.112). If the length $N_1$ is smaller than the optimum length and the others are equal to or larger than the optimum length, we have just $m_1 > m_2$. In this case, the lengths are left unchanged. Finally, if $m_1 < m_2 < m_3$ (ideally they are all equal), it means that all the lengths are larger than the optimum length and they are decreased. The second adaptive filter is the filter of interest, in the sense that its coefficient vector will be the closest one to the optimum Wiener filter and its length is closest to the optimum.


We emphasize that the step-sizes of all three adaptive filters satisfy the condition $\mu_1(n)N_1(n) = \mu_2(n)N_2(n) = \mu_3(n)N_3(n)$ at every time instant, which ensures the same misadjustment for the three algorithms. It is also due to this condition that the length update can be done using (2.112).
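The three steps of the VLLMS algorithm can be sketched as follows. This is a simplified illustration under stated assumptions rather than the exact procedure used for the simulations: the test interval `L` is kept fixed instead of being recomputed from the time constant, the regressor is zero-padded before the first sample, and a guard (not part of the description above) keeps the shortest filter at a length of at least one. The names `regressor` and `vllms` are hypothetical.

```python
import numpy as np

def regressor(x, n, N):
    """Return [x(n), x(n-1), ..., x(n-N+1)], with zeros before the first sample."""
    v = np.zeros(N)
    k = min(N, n + 1)
    v[:k] = x[n::-1][:k]
    return v

def vllms(x, d, N0, mu, L):
    """Three parallel LMS filters of lengths N1, N1+1, N1+2; returns h2 and N2(n)."""
    h = [np.zeros(N0 + l) for l in range(3)]
    acc = np.zeros(3)                        # sums of squared errors in the current block
    n2_hist = np.zeros(len(x), dtype=int)
    for n in range(len(x)):
        for l in range(3):
            Nl = len(h[l])
            xn = regressor(x, n, Nl)
            e = d[n] - h[l] @ xn
            h[l] = h[l] + (mu * N0 / Nl) * xn * e     # step-size mu * N0 / N_l
            acc[l] += e * e
        if (n + 1) % L == 0:                 # end of a block of L iterations
            m1, m2, m3 = acc / L
            if m1 > m2 > m3:
                h = [np.append(hl, 0.0) for hl in h]  # grow: new coefficient starts at zero
            elif m1 > m2 < m3:
                pass                                   # lengths left unchanged
            elif len(h[0]) > 1:                        # assumed guard: keep N1 >= 1
                h = [hl[:-1] for hl in h]              # shrink: drop the last coefficient
            acc[:] = 0.0
        n2_hist[n] = len(h[1])
    return h[1], n2_hist
```

Because the step-size of each filter is recomputed from its current length as `mu * N0 / Nl`, the products of step-size and length stay equal for the three filters, which is the condition the length test relies on.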
Figure 2.20: The length of the second adaptive filter ($N_2(n)$) during one run.

Figure 2.21: The average length of the second adaptive filter ($E\{N_2(n)\}$).

Figure 2.22: The MSE of the second adaptive filter.

Figure 2.23: The MSE of an adaptive filter with length $N = 19$.

2.3.2 Simulations and results


The new algorithm was tested in the system identification framework. The length of the unknown system was $N = 19$. The lengths of the adaptive filters were initialized with $N_1(0) = 7$, $N_2(0) = 8$ and $N_3(0) = 9$, respectively. The step-size $\mu$ was initialized with $10^{-2}$, which satisfies the condition for convergence in the mean square sense, and the number of time constants in (2.113) was $P = 2$. We note that, because the step-sizes of all adaptive filters satisfy the condition $\mu_1(n)N_1(n) = \mu_2(n)N_2(n) = \mu_3(n)N_3(n)$ at each iteration, the convergence of all adaptive filters is ensured when $\mu = 10^{-2}$.

The input signal $x(n)$ was white Gaussian with zero mean and variance $\sigma_x^2 = 1$.
The output noise was random Gaussian distributed with zero mean and variance a? = 
LO: 

The length behavior of the second adaptive filter during one run is depicted in Fig. 2.20 and the expected value of $N_2(n)$ during the adaptation is depicted in Fig. 2.21. The expected value of $N_2(n)$ in Fig. 2.21 was computed by averaging the results of 100 independent runs. We can see from these figures that the length of the second adaptive filter converges close to the length of the unknown filter.

In Fig. 2.22 and Fig. 2.23 the MSE of the second adaptive filter and the MSE of an adaptive filter with length equal to $N = 19$ are depicted. The convergence of the filter with adaptive length is slower than the convergence of the adaptive filter whose length equals the optimum length, due to the length adaptation. At early stages of the length adaptation, the MSE is larger because many coefficients of the unknown filter have no counterpart in the adaptive filter. When the length is close to the optimum, the MSE decreases.


2.4 Step-size adaptation in time-varying environment 


It is well known that, in the case of tracking a time-varying system, the steady-state MSE is a nonlinear function of the algorithm step-size. Moreover, there is an optimum step-size which minimizes the steady-state MSE in a time-varying environment, as detailed in Section 2.1.3. There are many papers in the open literature that study the behavior of different LMS-based algorithms for tracking time-varying systems. However, to the best of our knowledge, in the existing literature the computation of the optimum step-size is done by making some assumptions about the parameters of the time-varying systems and using the theoretical formulas (2.80) and (2.81). Here, we introduce a simple adaptive algorithm which iteratively adjusts the value of the step-size toward the optimum, such that the steady-state output MSE is minimized. The proposed algorithm uses the parameterization of the nonlinear function that gives the dependence between the steady-state MSE and the step-size. During the adaptation, the parameters of this nonlinear function are computed; therefore, estimates of the optimum step-size can be easily obtained without any prior information about the system parameters. The step-size of the proposed algorithm converges close to $\mu_{opt}^{mse}$, as can be seen from the simulations shown at the end of this section.
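To introduce our algorithm, we refer to (2.79) from Section 2.1.3 and we introduce the following notations:

$$A = J_{\min} = \sigma_v^2, \qquad B = \frac{1}{2}\sigma_v^2\,\mathrm{tr}[\mathbf{R}], \qquad C = \frac{1}{2}\mathrm{tr}[\mathbf{Q}] \tag{2.114}$$

With these notations, (2.79) can be written as follows:

$$J_{st} = A + \mu B + \frac{C}{\mu} \tag{2.115}$$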
In (2.115) $A$, $B$ and $C$ are unknown. We only know the step-size $\mu$ and the steady-state MSE$^6$. To estimate $J_{st}$ during the adaptation process, we use the following iterative method:

$$P(n) = \alpha P(n-1) + (1-\alpha)e^2(n), \qquad J(n) = \frac{1}{L}\sum_{i=n-L+1}^{n} P(i) \tag{2.116}$$

$\alpha \in (0,1)$ being a constant parameter and $L$ the number of consecutive iterations over which $J(n)$ is computed.
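The two-stage estimate of (2.116), a first-order recursive smoothing of the squared error followed by an average over the last $L$ smoothed values, might be implemented as in the sketch below. It assumes that the recursion is driven by the current squared error $e^2(n)$ and that the smoother starts from zero; the function name `mse_estimate` is hypothetical.

```python
import numpy as np

def mse_estimate(e, alpha, L):
    """Smoothed error power P(n) and its average J over the last L samples."""
    e = np.asarray(e, dtype=float)
    P = np.zeros(len(e))
    p = 0.0                              # assumption: the smoother starts from zero
    for n, en in enumerate(e):
        p = alpha * p + (1.0 - alpha) * en * en
        P[n] = p
    J = float(np.mean(P[-L:]))           # J(n) at the last time instant of the record
    return P, J
```

With a smoothing coefficient close to one the smoother has a long memory, so the first values of `P` are biased towards the assumed zero initial state; this is one more reason to trust the estimate only once the filter has run for a while.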

At this point, we have to make a very important remark: in (2.116) the MSE is estimated during the adaptation process, while $J_{st}$ in (2.115) is the MSE at the steady-state. This is why we have used the notation $J(n)$ instead of $J_{st}$ in (2.116). We emphasize that, when $J(n)$ is used instead of the steady-state MSE in (2.115), the values of the optimum step-size computed when the algorithm is far from the steady-state ($n \ll \infty$) are erroneous. For this reason, we do not compute the optimum step-size just once during the adaptation process, but many times. When the algorithm gets close to its steady-state, the MSE estimate from (2.116) converges close to $J_{st}$ and the value of the estimated optimum step-size based on (2.115) is close to $\mu_{opt}^{mse}$ given by (2.80).

$^6$ Actually, we estimate the steady-state MSE $J_{st}$ during the adaptation.

The parameters $A$, $B$ and $C$ do not depend on the adaptive filter but only on the statistics of the input signal, the output noise and the unknown filter. For three adaptive LMS filters with the same length and different step-sizes $\mu_1$, $\mu_2$ and $\mu_3$, three different steady-state MSEs ($J_{st_1}$, $J_{st_2}$ and $J_{st_3}$) are obtained. The nonlinear functions that give the dependence between these three steady-state MSEs and the corresponding step-sizes are expressed as follows:

$$\begin{aligned} J_{st_1} &= A + \mu_1 B + \frac{C}{\mu_1}\\ J_{st_2} &= A + \mu_2 B + \frac{C}{\mu_2}\\ J_{st_3} &= A + \mu_3 B + \frac{C}{\mu_3} \end{aligned} \tag{2.117}$$

The system of equations in (2.117) is linear in $A$, $B$ and $C$. Its solution can be easily obtained as follows:

$$C = \frac{(J_{12}\mu_{23} - J_{23}\mu_{12})\,\mu_1\mu_2\mu_3}{\mu_{12}\mu_{23}\mu_{13}}, \qquad B = \frac{J_{23}}{\mu_{23}} + \frac{C}{\mu_2\mu_3} \tag{2.118}$$

where $J_{12} = J_{st_1} - J_{st_2}$, $J_{23} = J_{st_2} - J_{st_3}$, $\mu_{12} = \mu_1 - \mu_2$, $\mu_{13} = \mu_1 - \mu_3$ and $\mu_{23} = \mu_2 - \mu_3$.

Finally, the optimum step-size $\mu_{opt}^{mse}$ is estimated as follows:

$$\mu_{opt}^{mse} = \sqrt{\frac{C}{B}} \tag{2.119}$$

Equations (2.117)-(2.119) are valid only when all three adaptive filters are at the steady-state. If they are used during the transient period of the adaptive filters, the estimated step-size in (2.119) is far from the optimum. Since the parameters $A$, $B$ and $C$ are computed many times during the adaptation process, as the three algorithms converge the value of the step-size computed using (2.119) also converges to $\mu_{opt}^{mse}$.
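The closed-form solution of the three-equation system and the resulting step-size estimate can be checked numerically with a short sketch. It assumes the model $J = A + \mu B + C/\mu$ for each filter; the back-substitution for $A$ and the guard against non-positive $B$ or $C$ (which can occur with noisy transient estimates) are additions made for this example, and the function name is hypothetical.

```python
import math

def optimum_step_size(J, mu):
    """Estimate A, B, C and sqrt(C/B) from three (MSE, step-size) pairs."""
    J1, J2, J3 = J
    mu1, mu2, mu3 = mu
    J12, J23 = J1 - J2, J2 - J3
    mu12, mu13, mu23 = mu1 - mu2, mu1 - mu3, mu2 - mu3
    C = (J12 * mu23 - J23 * mu12) * mu1 * mu2 * mu3 / (mu12 * mu23 * mu13)
    B = J23 / mu23 + C / (mu2 * mu3)
    A = J1 - mu1 * B - C / mu1           # back-substitution, added for completeness
    if B <= 0.0 or C <= 0.0:             # assumed guard for unusable estimates
        raise ValueError("estimates do not define a valid optimum step-size")
    return A, B, C, math.sqrt(C / B)
```

The three step-sizes must be pairwise different, otherwise the denominators vanish; this is why the algorithm keeps the three filters at distinct step-sizes.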

The same block diagram as the one in Fig. 2.19 is used for the proposed algorithm. The difference is that the three FIR adaptive filters have the same length $N$ and different step-sizes. Their coefficients are updated as follows:

$$\begin{aligned} \mathbf{h}_1(n+1) &= \mathbf{h}_1(n) + \mu_1 e_1(n)\mathbf{x}(n),\\ \mathbf{h}_2(n+1) &= \mathbf{h}_2(n) + \mu_2(n)e_2(n)\mathbf{x}(n),\\ \mathbf{h}_3(n+1) &= \mathbf{h}_3(n) + \mu_3(n)e_3(n)\mathbf{x}(n). \end{aligned} \tag{2.120}$$


Note that in (2.120) two of the adaptive filters have time-varying step-sizes updated as 


follows: 


CC if n=L,2L,... 3 
pa(n +1) = yi fon=L 7 pig(n +1) =< po(n +1), (2.121) 
[l2(n), otherwise 
where $C$ and $B$ are computed as in (2.118).

Figure 2.24: Step-size behavior for the new algorithm in the first case $\sigma_{e_1}^2 = \sigma_{e_2}^2 = \cdots = \sigma_{e_N}^2 = 10^{-6}$.

Figure 2.25: MSE of two adaptive filters with $\mu_1 = 0.1$ and $\mu_2(n)$ in the first case $\sigma_{e_1}^2 = \cdots = \sigma_{e_N}^2 = 10^{-6}$.

The proposed algorithm can be described as follows: three adaptive LMS filters with different step-sizes are used. All the filters operate independently and have fixed step-sizes for $L$ consecutive iterations (test interval of length $L$). At the end of the test interval, estimates of the MSE of each filter are computed using (2.116) and, based on the linear system of equations (2.117)-(2.119), an intermediate value of the optimum step-size is estimated. The step-size $\mu_2(n)$ of the second adaptive filter is updated with the value of this estimated optimum step-size. The adaptation continues with the middle filter having the new step-size, and after each test interval a new optimum step-size is computed and $\mu_2(n)$ is changed accordingly. The step-size $\mu_3(n)$ of the third adaptive filter is modified by (2.121). Choosing this formula to modify the step-size $\mu_3(n)$, we ensure that $\mu_2(n) > \mu_3(n)$ always holds. Imposing also that $\mu_1 > \mu_2(n)$ at each time instant, by setting $\mu_1$ close to the stability limit, the three algorithms will have different step-sizes and the system of equations (2.117) will always have a unique solution.


2.4.1 Simulations and results 


The proposed algorithm was tested in the channel estimation framework depicted in Fig. 2.19. The model of the channel used in our simulations is given by (2.67). The output noise $v(n)$ was white Gaussian with zero mean and variance $\sigma_v^2 = 25 \times 10^{-4}$. The length of the time-varying channel and the lengths of all adaptive filters were chosen equal to $N = 8$. The input sequence $x(n)$ was white Gaussian with zero mean and variance $\sigma_x^2 = 1$. For the first adaptive filter, with fixed step-size, we have chosen $\mu_1 = 0.1$, and the step-sizes of the two adaptive filters with variable step-sizes were initialized with $\mu_3(0) = 0.05$ and $\mu_2(0) = 0.075$. The length of the test interval was chosen as $L = 100$ iterations and the smoothing coefficient in (2.116) was $\alpha = 0.99$. The simulation results are presented for two different cases.

Figure 2.26: Step-size behavior for the new algorithm in the second case $\sigma_{e_1}^2 = \sigma_{e_N}^2 = 10^{-6}$ and $\sigma_{e_2}^2 = \cdots = \sigma_{e_{N-1}}^2 = 0$.

Figure 2.27: MSE of two adaptive filters with $\mu_1 = 0.1$ and with $\mu_2(n)$ in the second case $\sigma_{e_1}^2 = \sigma_{e_N}^2 = 10^{-6}$ and $\sigma_{e_2}^2 = \cdots = \sigma_{e_{N-1}}^2 = 0$.

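In the first case, every element of the vector $\mathbf{e}(n)$ in (2.67) was chosen from a random zero mean sequence with variance $\sigma_e^2 = 10^{-6}$. In this case, $\mathbf{Q} = \sigma_e^2\mathbf{I}$, with $\mathbf{I}$ being the identity matrix. The direct computation of the optimum step-size using (2.80) gives:

$$\mu_{opt}^{mse} = \sqrt{\frac{\mathrm{tr}[\mathbf{Q}]}{\sigma_v^2\,\mathrm{tr}[\mathbf{R}]}} = 0.02$$

where $\mathrm{tr}[\mathbf{Q}] = N\sigma_e^2 = 8 \times 10^{-6}$, $\mathrm{tr}[\mathbf{R}] = N\sigma_x^2 = 8$ and $\sigma_v^2 = 25 \times 10^{-4}$.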

The behavior of the step-size $\mu_2(n)$ during the adaptation is depicted in Fig. 2.24. The MSEs of the adaptive filter with fixed step-size $\mu_1$ and of the adaptive filter with variable step-size $\mu_2(n)$ are depicted in Fig. 2.25. As we can see from Fig. 2.24, during the transient period of the algorithms the value of the step-size $\mu_2(n)$ is far from the optimum. As the adaptive filters $\mathbf{h}_1(n)$, $\mathbf{h}_2(n)$ and $\mathbf{h}_3(n)$ approach their steady-state, the value of $\mu_2(n)$ converges to approximately $\mu_{opt}^{mse} = 0.02$. From Fig. 2.25 we can conclude that an adaptive step-size gives better performance in terms of lower steady-state MSE.

In the second case, just the first and the last coefficients of the channel were time-varying ($\sigma_{e_1}^2 = \sigma_{e_N}^2 = 10^{-6}$) and the rest of the coefficients were left unchanged ($\sigma_{e_2}^2 = \cdots = \sigma_{e_{N-1}}^2 = 0$). The optimum step-size obtained by direct application of (2.77) was:

$$\mu_{opt}^{mse} = \sqrt{\frac{\mathrm{tr}[\mathbf{Q}]}{\sigma_v^2\,\mathrm{tr}[\mathbf{R}]}} = 0.01$$

where $\mathrm{tr}[\mathbf{Q}] = 2\sigma_e^2 = 2 \times 10^{-6}$.
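The two theoretical step-sizes quoted for the simulations can be reproduced with a few lines of arithmetic. The sketch assumes the optimum is the square root of $\mathrm{tr}[\mathbf{Q}]$ divided by $\sigma_v^2\,\mathrm{tr}[\mathbf{R}]$, a white input (so that $\mathrm{tr}[\mathbf{R}] = N\sigma_x^2$), and a coefficient-increment variance of $10^{-6}$; the variable names are hypothetical.

```python
import math

def mu_opt(tr_Q, sigma_v2, tr_R):
    """Theoretical optimum step-size from the traces of Q and R."""
    return math.sqrt(tr_Q / (sigma_v2 * tr_R))

N = 8                  # channel and filter length
sigma_x2 = 1.0         # input variance
sigma_v2 = 25e-4       # output noise variance
var_increment = 1e-6   # variance of each time-varying coefficient increment

# First case: all N coefficients are time-varying.
case1 = mu_opt(N * var_increment, sigma_v2, N * sigma_x2)
# Second case: only the first and the last coefficients are time-varying.
case2 = mu_opt(2 * var_increment, sigma_v2, N * sigma_x2)
```

Halving the step-size when only two of the eight coefficients vary follows directly from the square-root dependence on $\mathrm{tr}[\mathbf{Q}]$, which drops by a factor of four between the two cases.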

The behavior of the step-size $\mu_2(n)$ is depicted in Fig. 2.26 and the MSEs of the adaptive filter with step-size $\mu_1$ and of the adaptive filter with variable step-size $\mu_2(n)$ are shown in Fig. 2.27. From Fig. 2.26 and Fig. 2.27 we can see that the step-size $\mu_2(n)$ converges to the optimum $\mu_{opt}^{mse}$. The adaptive filter with variable step-size $\mu_2(n)$ gives a lower steady-state MSE than the adaptive filter with fixed step-size $\mu_1$.


2.5 Order Statistics Least Mean Squared algorithms 


The LMS suffers serious performance degradation and may fail when the input and/or the desired signals are corrupted by impulsive noise. To overcome this difficulty, the class of Order Statistic LMS (OSLMS) adaptive filters was introduced (see [23], [32], [33], [39] and the references therein). In the case of OSLMS algorithms the coefficients of the adaptive filter are updated according to the following formula:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu\,O\{\mathbf{g}(n)\}\mathbf{a}, \tag{2.122}$$

where $\mathbf{h}(n)$ is the vector of the adaptive filter coefficients, $\mu$ is the step-size, $\mathbf{a} = [a_1 \ldots a_L]^T$ is a vector of weighting coefficients for smoothing the gradient, and $O\{\mathbf{g}(n)\}$ is the ordering operation applied to each row of the matrix $\mathbf{g}(n)$, which is given by:

$$\mathbf{g}(n) = \begin{bmatrix} e(n)x(n) & \cdots & e(n-L+1)x(n-L+1)\\ e(n)x(n-1) & \cdots & e(n-L+1)x(n-L)\\ \vdots & & \vdots\\ e(n)x(n-N+1) & \cdots & e(n-L+1)x(n-N-L+2)\end{bmatrix}. \tag{2.123}$$


$x(n)$ and $e(n)$ are the input sequence and the output error, respectively. The class of OSLMS filters includes the following adaptive filters as particular cases:

• the Average LMS (ALMS) with $\mathbf{a} = \frac{1}{L}[1, 1, \ldots, 1]^T$;

• the Median LMS (MLMS) with $\mathbf{a} = [0, \ldots, 1, \ldots, 0]^T$;

• the Trimmed Mean LMS (MxLMS) with $\mathbf{a} = \left[0, \ldots, 0, \frac{1}{L-2M}, \ldots, \frac{1}{L-2M}, 0, \ldots, 0\right]^T$, where $L$ is the length of the weighting vector and $M$ is the number of samples eliminated from the left and right side of the ordered input sequence;

• the Outer Mean LMS (OxLMS) with $\mathbf{a} = [1/2, 0, \ldots, 0, 1/2]^T$.
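The OSLMS update and its particular cases can be illustrated with the sketch below. It builds the gradient matrix entry by entry as $e(n-j)\,x(n-j-k)$, sorts every row in ascending order, and applies the weighting vector. The function names are hypothetical, the median weights assume an odd $L$, and the trimmed-mean weights use the standard value $1/(L-2M)$ for the retained samples, which is an assumption here.

```python
import numpy as np

def gradient_matrix(x, e, n, N, L):
    """N x L matrix g(n): entry (k, j) is e(n-j) * x(n-j-k)."""
    g = np.empty((N, L))
    for k in range(N):
        for j in range(L):
            g[k, j] = e[n - j] * x[n - j - k]
    return g

def os_weights(kind, L, M=0):
    """Weighting vectors for the particular cases of the OSLMS class."""
    a = np.zeros(L)
    if kind == "average":
        a[:] = 1.0 / L
    elif kind == "median":
        a[L // 2] = 1.0                       # assumes an odd L
    elif kind == "trimmed":
        a[M:L - M] = 1.0 / (L - 2 * M)        # drop M samples on each side
    elif kind == "outer":
        a[0] = a[-1] = 0.5
    else:
        raise ValueError("unknown weighting: " + kind)
    return a

def oslms_update(h, mu, g, a):
    """h(n+1) = h(n) + mu * O{g(n)} a, with O sorting each row in ascending order."""
    return h + mu * np.sort(g, axis=1) @ a
```

With the average weights the ordering has no effect and the update reduces to an LMS step that uses the mean of the last $L$ instantaneous gradients, which makes the averaged case a convenient sanity check.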


As shown in [32], [33], [39], the OSLMS algorithms can reduce the variance of the gradient estimate if the weighting coefficients are chosen properly. This leads to a reduction of the steady-state excess MSE. It was already established in [33] that for impulsive environments the MLMS and MxLMS algorithms behave better than OxLMS and ALMS, whereas for Gaussian and uniform noise environments the optimum choices are the ALMS and OxLMS algorithms, respectively. However, the selection of each of these algorithms has to be based on a priori knowledge of the noise distribution or, more generally, on knowledge of the gradient distribution. Without this knowledge, an arbitrarily chosen filter may have poor performance. Here we propose a new AOSLMS algorithm that uses adaptive weighting coefficients $\mathbf{a}(n)$ for smoothing the gradient. A novelty of the new algorithm is the fact that no prior information about the gradient distribution is necessary. Some approaches that adapt the weighting coefficients $\mathbf{a}(n)$ based on statistical measurements of the gradient have been reported in the literature (see, e.g., [32]), but they are limited to the modification of the trimming coefficient, such that the OS filter is modified between mean and median.


2.5.1 The proposed algorithm 


The update equation (2.122) of the OSLMS filter is modified as:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu\,O\{\mathbf{g}(n)\}\mathbf{a}(n), \tag{2.124}$$

where the notations are those from (2.122).

Note that in (2.124) the values of the weighting coefficients $\mathbf{a}(n)$ are not constant during the adaptation, but are adapted to the gradient distribution. In order to adapt the coefficients $\mathbf{a}(n)$ we have implemented an L-LMS filter (see, e.g., [23] and [64]). It was already proved that these filters are able to adapt their coefficients to the distribution of the input sequence. Due to the topology of the matrix $\mathbf{g}(n)$, the distributions of its rows are similar; therefore, the adaptation of $\mathbf{a}(n)$ can be done using samples of the gradient contained in the first row of $\mathbf{g}(n)$. The block diagram of the new AOSLMS algorithm for system identification is presented in Fig. 2.28 and the block diagram of the L-LMS filter for the gradient is depicted in Fig. 2.29.

The new algorithm consists of the following steps.

• Compute the output $\hat{y}(n)$ and the error $e(n)$ of the AOSLMS filter (see Fig. 2.28):

$$\hat{y}(n) = \mathbf{h}^T(n)\mathbf{x}(n), \qquad e(n) = y(n) + v(n) - \hat{y}(n).$$

• Update the weighting coefficients $\mathbf{a}(n)$ (see Fig. 2.29):

$$\mathbf{a}(n+1) = \mathbf{a}(n) + \alpha\,\mathbf{g}_{(o)}(n)e_g(n), \tag{2.125}$$

where $\mathbf{g}_{(o)}(n)$ is the ordered version of the first row of $\mathbf{g}(n)$, and $e_g(n)$ is the error of the L-LMS filter applied to the gradient.

• Update the coefficients of the OSLMS filter:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu\,O\{\mathbf{g}(n)\}\mathbf{a}(n). \tag{2.126}$$

Figure 2.28: The block diagram of the OSLMS for system identification when the noise appears at the input and output of the filters.

Figure 2.29: The block diagram of the L-LMS filter used for adaptation of the weighting coefficients $\mathbf{a}(n)$.


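In (2.125), we have used the error $e_g(n)$ for updating the weighting coefficients. Since in this case there is no desired signal available for filtering the gradient ($d_g(n) = 0$ in Fig. 2.29), we have chosen the constrained L-LMS described in [79], and (2.125) becomes:

$$\mathbf{a}(n+1) = \mathbf{P}\left[\mathbf{a}(n) + \alpha\,\mathbf{g}_{(o)}(n)\left(-y_g(n)\right)\right] + \mathbf{F}, \tag{2.127}$$

where $y_g(n) = \mathbf{g}_{(o)}^T(n)\mathbf{a}(n)$ is the output from the L-LMS filter, and the matrix $\mathbf{P}$ and the vector $\mathbf{F}$ are respectively given by (see [79] and the references therein):

$$\mathbf{P} = \mathbf{I} - \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\mathbf{C}^T, \qquad \mathbf{F} = \mathbf{C}(\mathbf{C}^T\mathbf{C})^{-1}\mathcal{F} \tag{2.128}$$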
7
1 0...0 |, Lodd 
bal 
-Qra ; 
C= GS eile (2.129) 
Iz 
2 0 
1 0...0 ], Leven 
-Qz 
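$\mathbf{I}$ and $\mathbf{Q}$ are the identity and the opposite identity matrices, respectively:

$$\mathbf{I} = \begin{bmatrix} 1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix}, \qquad \mathbf{Q} = \begin{bmatrix} 0 & \cdots & 0 & 1\\ 0 & \cdots & 1 & 0\\ \vdots & & & \vdots\\ 1 & 0 & \cdots & 0\end{bmatrix}.$$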


In the new algorithm there are basically two adaptive algorithms: one is used
for adaptation of the coefficients a(n), and the second one is the AOSLMS algorithm itself.
Therefore, two step-sizes have to be chosen for the convergence of the
algorithm. The most sensitive step-size is that of the L-LMS filter employed for the
gradient. If this step-size is not appropriately chosen and the L-LMS filter diverges, then
the whole algorithm diverges. The main problem is to find a robust condition for this
step-size that ensures stability of the algorithm for a wide range of gradient distributions. To
this aim, we have employed a normalized L-LMS and the value of $\alpha$ in (2.127) is replaced
with:

\[ \alpha = \frac{\tilde{\mu}}{\gamma + \|\mathbf{g}_{(o)}(n)\|^2}. \tag{2.130} \]

Finally, the new algorithm is described by formulas (2.125)-(2.130).
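
The constrained, normalized update of the weighting coefficients in (2.127), (2.128) and (2.130) can be sketched as follows. This is only a minimal illustration and not the implementation used in the thesis: the constraint matrix is taken here as a single sum-to-one constraint (an assumption; the actual C is the one of (2.129)), the ordered gradient row is replaced by sorted random numbers, and the names `frost_projection`, `constrained_llms_step`, `mu_tilde`, `gamma` and `f` are hypothetical.

```python
import numpy as np

def frost_projection(C, f):
    """Return P and F of (2.128) for the linear constraint C^t a = f."""
    CtC_inv = np.linalg.inv(C.T @ C)
    P = np.eye(C.shape[0]) - C @ CtC_inv @ C.T
    F = C @ CtC_inv @ f
    return P, F

def constrained_llms_step(a, g_ordered, P, F, mu_tilde=0.5, gamma=1e-3):
    """One update of (2.127) using the normalized step-size of (2.130)."""
    alpha = mu_tilde / (gamma + g_ordered @ g_ordered)   # normalized step-size
    y_g = g_ordered @ a                                  # output of the L-LMS filter
    # No desired signal is available (d_g(n) = 0), so the error is -y_g.
    return P @ (a + alpha * g_ordered * (-y_g)) + F

# Assumed example: L = 7 weights constrained to sum to one.
L = 7
C = np.ones((L, 1))
f = np.array([1.0])
P, F = frost_projection(C, f)

rng = np.random.default_rng(0)
a = np.full(L, 1.0 / L)
for _ in range(100):
    g_ordered = np.sort(rng.standard_normal(L))   # stand-in for the ordered gradient row
    a = constrained_llms_step(a, g_ordered, P, F)
```

Because P projects onto the null space of C^t and F satisfies the constraint, the weights keep summing to one after every update, whatever the gradient samples are.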


Steady-state study 


The equation for updating the coefficients of the L-LMS filter is given by (2.127). Denoting
$\mathbf{z}(n) = E\{\mathbf{a}(n) - \mathbf{a}_o\}$ ($\mathbf{a}_o$ are the optimal coefficients of the L-LMS filter), and using the
development of Frost in [31], one obtains:

\[ \mathbf{z}(n+1) = \left[\mathbf{I} - \alpha\mathbf{P}\mathbf{R}_{g(o)}\mathbf{P}\right]\mathbf{z}(n) = \left[\mathbf{I} - \alpha\mathbf{P}\mathbf{R}_{g(o)}\mathbf{P}\right]^{n+1}\mathbf{z}(0), \tag{2.131} \]

where $\mathbf{R}_{g(o)}$ is the autocorrelation matrix of the ordered first row of matrix g(n) from
(2.123). The matrix $\mathbf{P}\mathbf{R}_{g(o)}\mathbf{P}$ determines both the speed of convergence and the
steady-state variance of the weighting coefficients a(n). If $0 < \alpha < 1/\lambda_{max}$ ($\lambda_{max}$ is
the maximum eigenvalue of matrix $\mathbf{P}\mathbf{R}_{g(o)}\mathbf{P}$), then the convergence in the mean of the
weighting vector a(n) is ensured. A more restrictive condition for the step-size, which
also ensures convergence of the average MSE, was given in [31]:

\[ 0 < \alpha < \frac{1}{\lambda_{max} + \frac{1}{2}\mathrm{tr}\left[\mathbf{P}\mathbf{R}_{g(o)}\mathbf{P}\right]}, \tag{2.132} \]

where tr[A] represents the trace of the matrix A.
If $\alpha$ satisfies

\[ 0 < \alpha < \frac{2}{3\,\mathrm{tr}\left[\mathbf{R}_{g(o)}\right]}, \tag{2.133} \]

then it is guaranteed to satisfy (2.132) as well (see [41] and the references therein). The matrix
$\mathbf{R}_{g(o)}$ is different for various gradient distributions, and a value of $\alpha$ chosen based on (2.133)
for a certain gradient distribution may not be suitable for another gradient distribution.
Since the convergence of the algorithm has to be ensured for any distribution without
a priori knowledge, the step-size in the proposed algorithm is chosen as
$\alpha = \tilde{\mu}/(\gamma + \|\mathbf{g}_{(o)}(n)\|^2)$; thus, (2.133) becomes:

\[ 0 < \tilde{\mu} < \frac{2}{3}. \tag{2.134} \]
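
The claim that the trace-based bound is the more conservative one can be checked numerically. The sketch below assumes that (2.132) reads 0 < α < 1/(λ_max + tr[PRP]/2) and that (2.133) reads 0 < α < 2/(3 tr[R]); the matrix R is a random positive definite stand-in for the gradient autocorrelation matrix and the sum-to-one constraint matrix C is an assumption made only for illustration. Since P is an orthogonal projection, λ_max(PRP) ≤ tr[PRP] ≤ tr[R], so the second bound can never exceed the first.

```python
import numpy as np

rng = np.random.default_rng(1)
L = 7
A = rng.standard_normal((L, L))
R = A @ A.T / L                          # random symmetric positive definite matrix
C = np.ones((L, 1))                      # assumed constraint matrix (sum-to-one)
P = np.eye(L) - C @ np.linalg.inv(C.T @ C) @ C.T

PRP = P @ R @ P
lam_max = np.linalg.eigvalsh(PRP).max()  # largest eigenvalue of P R P

bound_2132 = 1.0 / (lam_max + 0.5 * np.trace(PRP))   # right-hand side of (2.132)
bound_2133 = 2.0 / (3.0 * np.trace(R))               # right-hand side of (2.133)
```

Any step-size below `bound_2133` is therefore also below `bound_2132`, which is why the simpler trace condition can be used in practice.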


The asymptotic convergence point of (2.126), which we denote as $\mathbf{h}_o$, is the point
where $E\{O\{\mathbf{g}(k)\}\mathbf{a}(k)\} = 0$ when $\hat{\mathbf{h}}(k) = \mathbf{h}_o$. In the absence of input noise
(w(n) = 0, f(n) = x(n)), for the i-th coefficient of $\hat{\mathbf{h}}(n)$ we would have:

\[ E\{O\{\mathbf{g}_i(k)\}\mathbf{a}(k)\} = E\{O\{[g_{i,0}(k)\;\ldots\;g_{i,L-1}(k)]\}\mathbf{a}(k)\}, \tag{2.135} \]

where $g_{i,j}(k) = e(k-j)f(k-i-j)$, $i = 0,\ldots,N-1$, $j = 0,\ldots,L-1$. The value of $g_{i,j}(k)$ can
be written as $g_{i,j}(k) = \left[(\mathbf{h} - \hat{\mathbf{h}}(k-j))^t\mathbf{f}(k-j) + v(k-j)\right]f(k-i-j)$.
If $\hat{\mathbf{h}}(k) = \mathbf{h}$ for a large value of k, then (2.135) becomes:

\[ E\{O\{\mathbf{g}_i(k)\}\mathbf{a}(k)\} = E\{O\{[v(k)f(k-i)\;\ldots\;v(k-L+1)f(k-i-L+1)]\}\mathbf{a}(k)\}. \]

Since f(k) and v(k) are both zero mean, i.i.d., with symmetric distributions, and
independent of each other, and if we assume that the weighting coefficients a(k) are constant
for large k, then $E\{O\{\mathbf{g}_i(k)\}\mathbf{a}(k)\} = 0$, $\forall i = 0,\ldots,N-1$. This result means that the
coefficients $\mathbf{h}_o = \mathbf{h}$ represent the asymptotic convergence point of (2.126) for the case with
only output noise.


For the case with only input noise (v(n) = 0), the value of $g_{i,j}(k)$ in (2.135) is given by:


gig(k) = [bTE(k — 7) — AT (k — j)x(k—3)] Fk -i- 3) 
a fixed
AOSLMS | L-(N+L) NALD 
a_ adaptive +2(L +41) +17+2L-1 


ND NE) 


Table 2.4: Extra computations for OSLMS filters. 


Making the same assumptions, it can be shown that the coefficients $\hat{\mathbf{h}}(n)$ converge
within a small ball around $\mathbf{h}_o$ (see [32], [33], [39]).

Thus we can conclude that the new algorithm converges within a small ball around $\mathbf{h}_o$
for the input noise case and also for the output noise case, provided that the L-LMS filter
is convergent too.


Computational complexity analysis 


The computational complexity of the new algorithm is higher than that of the other
OSLMS algorithms. For comparison purposes, in Table 2.4 we present the extra computations
needed for the standard OSLMS and the new AOSLMS algorithms. In addition to
these computations, each algorithm needs N sorting operations performed on L samples,
where N is the length of the OSLMS filter and L is the length of the weighting vector.


2.5.2 Simulation results 


The simulations were done in the system identification framework depicted in Fig. 2.28.
The new AOSLMS algorithm was compared with the following OSLMS algorithms: MLMS,
MxLMS, OxLMS and ALMS. The length of the filters was N = 11 and the length of the
weighting vector was L = 7. The step-sizes of the compared algorithms were chosen in
such a way that they would have the same convergence speed. The step-size $\tilde{\mu}$ has a fixed
value chosen to satisfy (2.134). The input signal f(n) was a Gaussian random sequence
with zero mean and unit variance. The noise, either w(n) or v(n), has a generalized
exponential density of the form:

\[ p(r) = k_1 e^{-k_2|r|^{\beta}}, \qquad |r| < \infty, \quad \beta > 0, \]
\[ k_1 = \frac{\beta\,k_2^{1/\beta}}{2\,\Gamma(1/\beta)}, \qquad k_2 = \left[\frac{\Gamma(3/\beta)}{\sigma^2\,\Gamma(1/\beta)}\right]^{\beta/2}, \tag{2.136} \]

where $\Gamma$ is the gamma function and $\sigma^2$ is the variance of the noise.
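
As a sanity check of the density family, the sketch below evaluates a generalized exponential density and verifies numerically that it integrates to one and has the prescribed variance. It assumes the standard variance-normalized parametrization p(r) = k1 exp(-k2 |r|^β) with k2 = [Γ(3/β)/(σ²Γ(1/β))]^(β/2) and k1 = β k2^(1/β)/(2Γ(1/β)); the function name `gen_exp_pdf` and the integration grid are hypothetical choices. The highly impulsive case β = 0.2 is left out because its heavy tails would need a much wider grid.

```python
import math
import numpy as np

def gen_exp_pdf(r, beta, sigma2=1.0):
    """Generalized exponential density with shape beta and variance sigma2."""
    g1 = math.gamma(1.0 / beta)
    g3 = math.gamma(3.0 / beta)
    k2 = (g3 / (sigma2 * g1)) ** (beta / 2.0)
    k1 = beta * k2 ** (1.0 / beta) / (2.0 * g1)
    return k1 * np.exp(-k2 * np.abs(r) ** beta)

# Riemann-sum check of the total probability and of the variance.
r = np.linspace(-40.0, 40.0, 800001)
dr = r[1] - r[0]
checks = {}
for beta in (1.0, 2.0, 7.0):
    p = gen_exp_pdf(r, beta)
    area = np.sum(p) * dr             # should be close to 1
    var = np.sum(r ** 2 * p) * dr     # should be close to sigma2 = 1
    checks[beta] = (area, var)
```

For β = 2 the expression reduces to the Gaussian density, and for β = 1 to the Laplacian density, which matches the range from impulsive to near-uniform noise used in the simulations.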
Figure 2.30: Steady-state sum of squared coefficient errors for output noise case
w(n) = 0.

Figure 2.31: Steady-state sum of squared coefficient errors for input noise case
v(n) = 0.


As $\beta$ in (2.136) increases, the resulting noise density varies from highly impulsive
($\beta = 0.2$) to Gaussian ($\beta = 2$) and near uniform ($\beta = 7$). The algorithms were compared
using the steady-state sum of squared coefficient errors:

\[ \nu = 10\log_{10}\left[\sum_{i=1}^{N}\left(h_i - \hat{h}_i(\infty)\right)^2\right]. \]

For each algorithm, 100 independent runs were performed and the results
were averaged. The corresponding learning curves are given in Fig. 2.30 for the case of
output noise (w(n) = 0) and in Fig. 2.31 for the input noise case (v(n) = 0). We can
see from these figures that for impulsive environments ($\beta < 2$) the new algorithm has a
performance similar to MxLMS and MLMS, and for $\beta > 2$ the new AOSLMS algorithm
performs similarly to OxLMS and ALMS.
Chapter 3
Transform domain implementations 


This chapter considers the transform domain implementations of two classes of adaptive 
algorithms: the Transform Domain Variable Step-Size LMS (TDVSLMS) and the Trans- 
form Domain LMS algorithm for time-varying environments. The algorithms developed 
here represent the transform domain counterparts of the algorithms from Chapter 2 and 
they are introduced to improve the behavior of their time domain counterparts. For
instance, the convergence of the time domain VSSLMS algorithms is still slow for correlated
inputs, and it was found that the TDLMS can improve the convergence by decorrelating
the input sequence. Moreover, if the same approach of time-varying
step-size is used in the TDLMS, its speed of convergence is increased even further, as we
will see in the sequel. The time domain adaptation of the step-size toward the optimum,
for time-varying environments, requires, in the proposed algorithm, three adaptive filters
working in parallel, as shown in Section 2.4. When the step-size adaptation is done in the
transform domain, the required number of adaptive filters is reduced to two and the
convergence speed is also increased when the input sequence is highly correlated. As a result, the
transform domain implementation may be an alternative to the time domain adaptation
of the step-size.

In Section 3.1, the plain Transform Domain Least Mean Squared (TDLMS) algorithm
[20], [25] is reviewed and its theoretical analysis is briefly described. The
theoretical analysis of the TDLMS is not presented in great detail, since the same derivations
as those in Chapter 2 can be applied and there are many papers in the open
literature which study the analytical behavior of the TDLMS (see [4], [5], [20], [25], [30],
[45], [51], [58], [61], [73] and the references therein). In the analysis of the TDLMS, we
follow two main directions and the differences between the time domain and transform
domain LMS algorithms are emphasized. First, the transient and steady-state behavior
in terms of the MSE for stationary environments is discussed, and then the analysis of the
TDLMS algorithm for tracking a time-varying system with fixed length is presented. The
analytical expressions of the optimum step-sizes which minimize the steady-state MSE
and the steady-state mean squared coefficient error (MSC) are obtained next. At the end
of this section, some simulations which demonstrate the validity of the analytical results
are presented.

In Section 3.2, the class of Transform Domain Variable Step-Size LMS (TDVSLMS) algorithms
is addressed. First, a variable step-size LMS in the transform domain proposed by Kim
in [45] (the DCT-LMS) is briefly presented and then, based on the analytical results outlined
in the first section, three new TDVSLMS algorithms are introduced. The DCT-LMS
algorithm is slightly different from our approach, since it does not use the output error
to update the step-size as our algorithms do. Simulation results showing the behavior
of these new algorithms for the problem of system identification, when the input sequence
is highly correlated, are then given. In the simulations, we compare the behavior
of the proposed algorithms with the plain TDLMS, the DCT-LMS from [45] and the time
domain VSSLMS algorithms described in Section 2.2, so that the simulations shown
here complete the results from Section 2.2.

Section 3.3 is dedicated to the problem of tracking time-varying systems. From the 
theoretical results shown in the first section, we directly derive an adaptive algorithm 
in which the step-size is time-varying and converges near the optimum. The difference 
between the algorithm described here and the algorithm proposed in Section 2.4 lies in
the fact that the transform domain implementation uses just two adaptive filters. The 
new algorithm is implemented in the time-varying system identification framework and 
the simulation results are shown at the end of this section. 

At the end of the chapter, a very short description of the Scrambled LMS (SCLMS)
algorithm is given. Although the SCLMS does not use an orthogonal transform
at the input, we have chosen to include it here for two reasons. The first reason is
that the SCLMS transforms the input sequence by means of a scrambling device and,
for correlated inputs, this operation acts as a whitening process. As a consequence, both
SCLMS and TDLMS perform a decorrelation of the input prior to the adaptation of the
coefficients. The second reason to include the SCLMS here is its increased interest for
practical applications. In many applications a secure transmission of the data is necessary,
which can be realized by scrambling the transmitted data. When adaptation must be done
for scrambled data, the resulting algorithm is the SCLMS.

For these two reasons, in the last section of this chapter, a comparison between the
TDLMS and SCLMS for the problem of local echo cancellation is presented. The specific
application addressed is digital data transmission over a telephone line, and the
comparison is made in terms of convergence speed, steady-state mean coefficient error and
steady-state MSE.


3.1 The transform domain least mean squared algorithm


In the previous chapter, we have discussed the time domain Least Mean Squared
algorithm and some of its variants. From the analysis shown in Section 2.1, we have seen
that the MSE of the LMS can be approximated with a sum of exponentials whose time
constants are inversely proportional to the eigenvalues of the input autocorrelation matrix
R. As a consequence, if one of the eigenvalues of R is very small, the convergence of
the adaptive filter in this direction will be slow. We can conclude that the convergence
speed of the time domain LMS depends on the eigenvalue spread¹ of the input autocorrelation
matrix. Some algorithms which try to improve the convergence of the LMS, such as
the VSSLMS, can be implemented. However, the VSSLMS algorithms do not modify the
input sequence and its autocorrelation matrix; therefore they are expected to have slow
convergence for correlated inputs. One solution to this problem is to perform a decorrelation
of the input sequence using an orthogonal transform; the resulting algorithm is
called the Transform Domain LMS.

In this chapter, we consider the transform domain adaptive filter whose block diagram
is depicted in Fig. 3.1, where T represents the orthogonal transformation applied to the
input vector $\mathbf{x}(n) = [x(n), \ldots, x(n-N+1)]^t$, $\hat{\mathbf{h}}(n) = [\hat{h}_1(n), \ldots, \hat{h}_N(n)]^t$ is the N x 1
vector of the adaptive filter coefficients, and $\mathbf{s}(n) = [s_1(n), \ldots, s_N(n)]^t$, $\hat{y}(n)$, e(n) and d(n)
are the transform coefficients, the output of the adaptive filter, the output error and
the desired sequence, respectively. Other transform domain structures such as subband
adaptive filters [20] also exist in the literature but they are not the subject of this thesis.

Figure 3.1: The block diagram of the transform domain adaptive FIR filter.

With reference to Fig. 3.1, the transform coefficients are computed as follows:

\[ \mathbf{s}(n) = \mathbf{T}\mathbf{x}(n), \tag{3.1} \]

and the output of the adaptive filter is obtained from:

\[ \hat{y}(n) = \hat{\mathbf{h}}^t(n)\mathbf{s}(n) = \mathbf{s}^t(n)\hat{\mathbf{h}}(n), \tag{3.2} \]

with T being the N x N orthogonal matrix with real valued elements satisfying the
following relation:

\[ \mathbf{T}^t\mathbf{T} = \mathbf{T}\mathbf{T}^t = \mathbf{I}, \tag{3.3} \]

where t represents the transposition operator and I is the N x N identity matrix.

¹ The eigenvalue spread is defined as the ratio between the largest and the smallest eigenvalues of R.

Other transforms² with complex valued elements can equally be used for the
orthogonalization of the input sequence x(n). In this thesis the Discrete Cosine Transform is used
because it gives real valued coefficients. However, the orthogonal transformation used in
the implementations must be chosen based on the characteristics of the input signal x(n).
A very good discussion on the selection of the best transform can be found in [21] and [45].
Since the above referenced papers provide a well documented discussion, here we do not
address the problem of transform selection.

The cost function used to optimize the coefficients of the transform domain adaptive
filter is the MSE defined as:

\[ J(n) = E\left[e^2(n)\right], \qquad \text{where } e(n) = d(n) - \hat{y}(n). \tag{3.4} \]

From (3.4), the following Wiener-Hopf equation, which gives the coefficients of the
optimum filter, can be obtained for the transform domain (see Section 2.1 for more details):

\[ \mathbf{h}_{T_o} = \mathbf{R}_s^{-1}\mathbf{p}_s, \tag{3.5} \]

where $\mathbf{R}_s = E[\mathbf{s}(n)\mathbf{s}^t(n)] = \mathbf{T}\mathbf{R}\mathbf{T}^t$ is the autocorrelation matrix of the transformed
coefficients s(n), R is the autocorrelation matrix of x(n), $\mathbf{p}_s = \mathbf{T}\mathbf{p}$ is the cross-correlation
vector between s(n) and d(n), and p is the cross-correlation vector between x(n) and
d(n).
It follows from (3.5), using $\mathbf{R}_s^{-1} = \mathbf{T}\mathbf{R}^{-1}\mathbf{T}^t$ and $\mathbf{T}^t\mathbf{T} = \mathbf{I}$, that the optimum
coefficients vector is given by:

\[ \mathbf{h}_{T_o} = \mathbf{T}\mathbf{h}_o. \tag{3.6} \]

² Called unitary transforms, which satisfy $\mathbf{T}^H\mathbf{T} = \mathbf{T}\mathbf{T}^H = \mathbf{I}$, where H is the Hermitian transposition.
In (3.6), $\mathbf{h}_{T_o}$ denotes the optimum coefficients in the transform domain, whereas $\mathbf{h}_o$
represents the Wiener solution in the time domain expressed by (2.9). We can conclude that
the optimum solution in the mean squared sense in the transform domain is obtained simply
by applying the orthogonal transformation to the optimum coefficients in the time domain.
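
The relation between the two Wiener solutions can be verified numerically. The sketch below is only an illustration under stated assumptions: the input is taken as a first-order autoregressive process with correlation 0.9, the desired signal is the noise-free output of a hypothetical 4-tap system `h_true`, and T is an orthonormal DCT-II matrix built by the helper `dct_matrix` (a name introduced here).

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (N x N), so that T @ T.T is the identity."""
    k = np.arange(N).reshape(-1, 1)      # row (frequency) index
    m = np.arange(N).reshape(1, -1)      # column (time) index
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * m + 1) * k / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

N = 4
rho = 0.9                                        # assumed input correlation
idx = np.arange(N)
R = rho ** np.abs(np.subtract.outer(idx, idx))   # autocorrelation matrix of x(n)
h_true = np.array([1.0, -0.5, 0.25, 0.1])        # hypothetical unknown system
p = R @ h_true                                   # cross-correlation for d(n) = h^t x(n)

T = dct_matrix(N)
h_o = np.linalg.solve(R, p)                      # time domain Wiener solution
R_s = T @ R @ T.T                                # autocorrelation of s(n)
p_s = T @ p                                      # cross-correlation of s(n) and d(n)
h_To = np.linalg.solve(R_s, p_s)                 # transform domain solution, as in (3.5)
```

Solving the transform domain normal equations gives the same vector as transforming the time domain solution, which is the statement of (3.6).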
The input autocorrelation matrix which governs the convergence of the transform
domain adaptive filter is given by $\mathbf{R}_s = \mathbf{T}\mathbf{R}\mathbf{T}^t$. In the ideal case, when the elements of
s(n) are uncorrelated, the matrix $\mathbf{R}_s$ is diagonal. It follows that, when the transformation
matrix T is applied to the input sequence, the input autocorrelation matrix is diagonalized.
However, the diagonal elements of $\mathbf{R}_s$ are not equal and its eigenvalue spread is equal to
the eigenvalue spread of R. A solution to this problem is to normalize $\mathbf{R}_s$ by its diagonal
elements diag($\mathbf{R}_s$). Specifically, this normalization is applied only in the update formula
of the adaptive filter coefficients, which for the i-th filter coefficient can be written as
follows:

\[ \hat{h}_i(n+1) = \hat{h}_i(n) + \frac{\mu}{\varepsilon + \sigma_{s_i}^2(n)}\,s_i(n)e(n), \tag{3.7} \]

where $\sigma_{s_i}^2(n)$ is the power estimate of the i-th transform coefficient $s_i(n)$ and $\varepsilon$ is a small
constant which eliminates the numerical instability when $\sigma_{s_i}^2(n)$ is close to zero.
The powers of the transform coefficients $s_i(n)$ can be estimated by the following simple
formula³:

\[ \sigma_{s_i}^2(n) = \alpha\,\sigma_{s_i}^2(n-1) + (1-\alpha)\,s_i^2(n), \qquad \forall i = 1,\ldots,N, \tag{3.8} \]

which has been proven to converge close to the diagonal elements of $\mathbf{R}_s$.
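
The effect of the transform and of the power normalization on the eigenvalue spread can be illustrated with a small numerical experiment. The sketch assumes a first-order autoregressive input with correlation 0.9 and N = 8, uses an orthonormal DCT-II matrix, and replaces the running power estimates by the exact diagonal of the transformed autocorrelation matrix; the helper names `dct_matrix` and `eig_spread` are introduced here for illustration only.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (N x N), so that T @ T.T is the identity."""
    k = np.arange(N).reshape(-1, 1)
    m = np.arange(N).reshape(1, -1)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * m + 1) * k / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def eig_spread(M):
    """Ratio between the largest and the smallest eigenvalue of a symmetric matrix."""
    lam = np.linalg.eigvalsh(M)
    return lam.max() / lam.min()

N = 8
rho = 0.9                                         # assumed input correlation
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])    # time domain autocorrelation matrix
T = dct_matrix(N)
R_s = T @ R @ T.T                                 # transform domain autocorrelation matrix

# Symmetric power normalization; it has the same eigenvalues as diag(R_s)^-1 R_s.
D = np.diag(1.0 / np.sqrt(np.diag(R_s)))
R_norm = D @ R_s @ D

spread_time = eig_spread(R)
spread_transform = eig_spread(R_s)
spread_normalized = eig_spread(R_norm)
```

The orthogonal transform alone leaves the spread unchanged, while the normalized matrix has a much smaller spread for this input, which is the mechanism behind the faster convergence of the TDLMS.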


From (3.7) and (3.8) we can conclude that each coefficient of the adaptive filter is
updated by a different step-size $\mu_i(n) = \mu/(\varepsilon + \sigma_{s_i}^2(n))$, which is time-varying due to the
normalization term. For stationary inputs, it can be shown that (3.8) converges fast to
the real powers of the transform coefficients; therefore, to simplify the analysis, one can
consider that the estimates $\sigma_{s_i}^2$ are constant. However, in some cases where other power
estimators are used, the step-sizes are time-varying due to the normalization term. Such
an example is the DCT-LMS algorithm suggested in [45], where another formula is used
instead of (3.8).

In all our implementations, we have used (3.8) to estimate the powers of the transform
coefficients; therefore we can assume that, after a few iterations, the estimates in (3.8) are
constant and close to the diagonal terms of $\mathbf{R}_s$. As a consequence, in the analytical
derivations we use $\sigma_{s_i}^2$ instead of $\sigma_{s_i}^2(n)$.


Finally the TDLMS algorithm is described by the following six steps: 


³ Other approaches are also available in the open literature [45].
Transform Domain Least Mean Squared algorithm:

At each iteration n do:

1. Form the input vector $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^t$ from the input
sequence x(n);
2. Compute the coefficients s(n): $\mathbf{s}(n) = \mathbf{T}\mathbf{x}(n)$;
3. Compute the output of the adaptive filter: $\hat{y}(n) = \mathbf{s}^t(n)\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}^t(n)\mathbf{s}(n)$;
4. Compute the output error: $e(n) = d(n) - \hat{y}(n)$;
5. Update the power estimates of the coefficients s(n) using (3.8);
6. Update every coefficient of the adaptive filter: $\hat{h}_i(n+1) = \hat{h}_i(n) + \frac{\mu}{\varepsilon + \sigma_{s_i}^2(n)} s_i(n)e(n)$.
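
The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the code used for the thesis experiments: the DCT-II is used as the transform, the power estimates are initialized to one, and the test signal (a first-order autoregressive input filtered by a hypothetical system `h_true`, without noise) as well as the parameter values `mu`, `alpha` and `eps` are arbitrary choices.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (N x N), so that T @ T.T is the identity."""
    k = np.arange(N).reshape(-1, 1)
    m = np.arange(N).reshape(1, -1)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * m + 1) * k / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def tdlms(x, d, N, mu=0.05, alpha=0.99, eps=1e-6):
    """Transform domain LMS following steps 1 to 6; returns coefficients and errors."""
    T = dct_matrix(N)
    h_hat = np.zeros(N)
    power = np.ones(N)            # assumed initial value of the power estimates
    xbuf = np.zeros(N)
    err = np.zeros(len(x))
    for n in range(len(x)):
        xbuf = np.concatenate(([x[n]], xbuf[:-1]))    # step 1: input vector x(n)
        s = T @ xbuf                                  # step 2: transform coefficients
        y_hat = h_hat @ s                             # step 3: filter output
        e = d[n] - y_hat                              # step 4: output error
        power = alpha * power + (1 - alpha) * s ** 2  # step 5: power estimates (3.8)
        h_hat = h_hat + mu / (eps + power) * s * e    # step 6: normalized update (3.7)
        err[n] = e
    return h_hat, err

# Assumed test case: correlated AR(1) input and a noise-free 4-tap unknown system.
rng = np.random.default_rng(0)
N = 4
h_true = np.array([1.0, -0.5, 0.25, 0.1])
w = rng.standard_normal(5000)
x = np.zeros_like(w)
for n in range(1, len(w)):
    x[n] = 0.9 * x[n - 1] + np.sqrt(1 - 0.9 ** 2) * w[n]
d = np.convolve(x, h_true)[:len(x)]

h_hat, err = tdlms(x, d, N)
```

In this noise-free setting the adapted coefficients approach the transformed system `dct_matrix(N) @ h_true`, in agreement with (3.6).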


3.1.1 Analysis of the TDLMS for stationary environments 


We point out the important formulas for the transient and steady-state MSE and mean
coefficient error for the TDLMS when operating in a stationary environment. These
analytical results will be used in subsequent sections to introduce the new class of variable
step-size algorithms in the transform domain. Since most of the equations derived for the
time domain case in Section 2.1 can be applied here with small modifications, we will
not give the entire derivations but we will emphasize the differences between the time
domain approach and its transform domain counterpart. To make the analysis
mathematically tractable, similar assumptions as in the time domain are made (see Section
2.1).
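We start with the coefficient error vector defined as:

\[ \Delta\mathbf{h}(n) = \hat{\mathbf{h}}(n) - \mathbf{h}_{T_o}, \tag{3.9} \]

where $\mathbf{h}_{T_o}$ is the optimum solution defined by (3.6).

The update equation (3.7) can be written in a more compact form as follows:

\[ \hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)e(n), \tag{3.10} \]

where $\boldsymbol{\Gamma}$ is a diagonal matrix composed of the terms $\varepsilon + \sigma_{s_i}^2(n)$.
Subtracting $\mathbf{h}_{T_o}$ from (3.10) and taking the expectation, after some simple
mathematical manipulations we obtain:

\[ E[\Delta\mathbf{h}(n+1)] = \left(\mathbf{I} - \mu E\left[\boldsymbol{\Gamma}^{-1}\right]\mathbf{R}_s\right)E[\Delta\mathbf{h}(n)], \tag{3.11} \]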
where we have used the assumption that $\hat{\mathbf{h}}(n)$ is independent of s(n) and that the optimum
error, defined as $e_o(n) = d(n) - \mathbf{h}_{T_o}^t\mathbf{s}(n)$, is orthogonal to the input vector s(n)⁴.

Assuming that the diagonal elements of $\boldsymbol{\Gamma}$ are constant and equal to the powers of the
elements of s(n), equation (3.11) can be simplified to:

\[ E[\Delta\mathbf{h}(n+1)] = (1-\mu)^{n+1}E[\Delta\mathbf{h}(0)]. \tag{3.12} \]

From (3.12), the condition for the convergence of the coefficients in the mean is:

\[ 0 < \mu < 2. \tag{3.13} \]

Comparing (3.13) with its counterpart from the time domain (equation (2.19) from
Section 2.1.1), we can see that in the case of the TDLMS the condition for the convergence of the
adaptive filter coefficients does not depend on the eigenvalues of the input autocorrelation
matrix, while in the time domain the condition depends on $\lambda_{max}$. Intuitively, we can arrive at
the same conclusion if we take into account the fact that the autocorrelation matrix is
diagonalized by the orthogonal transform and the power normalization makes the
autocorrelation matrix equal to identity⁵. As a result, the eigenvalues of the autocorrelation
matrix are all equal to unity.

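To obtain an analytical expression for the correlation matrix of the coefficient
error vector, we subtract $\mathbf{h}_{T_o}$ from (3.10). Multiplying the result by its transpose and
taking the statistical expectation, we arrive at the following expression:

\[ \begin{aligned} \mathbf{C}(n+1) = {} & \mathbf{C}(n) - \mu\mathbf{C}(n)\mathbf{R}_s\boldsymbol{\Gamma}^{-1} - \mu\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\mathbf{C}(n) + 2\mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\mathbf{C}(n)\mathbf{R}_s\boldsymbol{\Gamma}^{-1} \\ & + \mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\boldsymbol{\Gamma}^{-1}\,\mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)] + J_{min}\mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\boldsymbol{\Gamma}^{-1}, \end{aligned} \tag{3.14} \]

where $\mathbf{C}(n) = E\{\Delta\mathbf{h}(n)\Delta\mathbf{h}^t(n)\}$.
Making the assumption that $\boldsymbol{\Gamma}^{-1}\mathbf{R}_s \approx \mathbf{I}$, equation (3.14) can be simplified as follows:

\[ \mathbf{C}(n+1) = \mathbf{C}(n) - \mu\mathbf{C}(n) - \mu\mathbf{C}(n) + 2\mu^2\mathbf{C}(n) + \mu^2\boldsymbol{\Gamma}^{-1}\,\mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)] + J_{min}\mu^2\boldsymbol{\Gamma}^{-1}. \tag{3.15} \]

Following a development similar to the one in the time domain, and starting from (3.4),
the following expression can be obtained to describe the MSE:

\[ J(n) = J_{min} + \mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)], \tag{3.16} \]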


⁴ The orthogonality of the optimum error to the input vector s(n) can be proven in a similar manner
as in the time domain, from the fact that the gradient equals zero when $\hat{\mathbf{h}}(n)$ approaches $\mathbf{h}_{T_o}$.
⁵ Actually it is close to identity. The only transform which makes the autocorrelation matrix equal to
the identity matrix is the KLT. However, the DCT was shown to be a good approximation of the KLT
for many practical signals.
where $J_{min} = E\{(d(n) - \mathbf{h}_{T_o}^t\mathbf{s}(n))^2\}$ is the minimum mean squared error obtained in the case
of perfect adaptation and the term $\mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)]$ appears due to the imperfect adaptation
of the coefficients.

The second term in (3.16) can be obtained by pre-multiplication of (3.15) with $\mathbf{R}_s$
and taking the trace of the result. At the steady-state we have $\mathbf{C}(n+1) \approx \mathbf{C}(n)$ and the
MSE can be expressed as follows:

\[ J_{st} = J_{min}\left(1 + \frac{\mu N}{2 - \mu(N+2)}\right), \tag{3.17} \]

which, for small values of the step-size $\mu$, is simplified to:

\[ J_{st} \approx J_{min}\left(1 + \frac{\mu N}{2}\right). \tag{3.18} \]

Intuitively (3.18) and (2.24) are equivalent, since in the TDLMS the normalization
by the powers of the transform coefficients is used; therefore the matrix which gives the
misadjustment of the algorithm is approximated by the identity matrix.

The condition for the convergence of the MSE can be obtained from (3.17) by forcing the
misadjustment to be bounded. The value of the step-size for which the misadjustment
tends to infinity is $\mu_{max} = 2/(N+2)$; therefore the step-size should satisfy the following
stability condition:

\[ 0 < \mu < \frac{2}{N+2}. \tag{3.19} \]

3.1.2 Optimum step-size for time-varying environments 


In this section, the behavior of the transform domain LMS algorithm for the problem of
time-varying system identification depicted in Fig. 3.2 is addressed. To this end, we
derive the analytical equations which describe the steady-state MSE and the steady-state
mean squared coefficient error in a similar manner as in Chapter 2. Since
many derivations are similar to those in the previous chapter, here we emphasize the
differences between time domain and transform domain implementations. The same setup
is followed as in the time domain, where the unknown system is modeled as follows:

\[ \mathbf{h}(n+1) = \mathbf{h}(n) + \boldsymbol{\varepsilon}(n), \tag{3.20} \]

with $\boldsymbol{\varepsilon}(n)$ being the vector of the increments of the unknown system at time instant n.
We consider the case when the unknown filter h(n) and the adaptive filter $\hat{\mathbf{h}}(n)$ in
Fig. 3.2 have the same length N. The notations in Fig. 3.2 are the following: x(n) is
the input sequence, the block denoted by T represents the transform layer which transforms
the input vector $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^t$ into $\mathbf{s}(n) = \mathbf{T}\mathbf{x}(n) =
[s_1(n), s_2(n), \ldots, s_N(n)]^t$, v(n) is the output noise, and $\hat{y}(n)$, y(n) and e(n) are the output of
the adaptive filter, the output of h(n) and the output error, respectively.

Figure 3.2: The block diagram of the transform domain adaptive FIR filter implemented
for time-varying system identification.

To make the theoretical analysis more tractable, we make the following assumptions,
which are commonly used in the open literature [20], [21], [28], [41]:

1. The sequences x(n), v(n) and $\boldsymbol{\varepsilon}(n)$ are statistically independent of one another;

2. The sequences x(n) and v(n) are zero mean, stationary, jointly normal and with
finite moments;

3. The successive increments of the channel tap weights $\boldsymbol{\varepsilon}(n)$ are independent of one
another. However, the elements of $\boldsymbol{\varepsilon}(n)$ for a given n may be statistically dependent.
The sequence $\boldsymbol{\varepsilon}(n)$ is zero mean and stationary with a constant covariance matrix
$\mathbf{Q} = E\{\boldsymbol{\varepsilon}(n)\boldsymbol{\varepsilon}^t(n)\}$;

4. At time n, the vector $\hat{\mathbf{h}}(n)$ is statistically independent of v(n) and x(n). This
assumption is true when $\mu$ is small [20], [41].

The Wiener-Hopf equations in the transform domain can be written in a similar manner
as in the time domain by setting the gradient of the MSE with respect to the adaptive
filter coefficients equal to zero. Taking into account that the input of the adaptive filter
is now $\mathbf{s}(n) = \mathbf{T}\mathbf{x}(n)$, that the desired sequence is given by $d(n) = \mathbf{h}^t(n)\mathbf{x}(n) + v(n)$, and
the above assumptions, the optimum Wiener solution is given by:

\[ \mathbf{h}_{T_o}(n) = \mathbf{T}\mathbf{h}(n). \tag{3.21} \]
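It follows from (3.21) that the optimum coefficients vector at time instant n equals the
transformed coefficients vector of the unknown filter at the same time instant n.
Subtracting $\mathbf{h}_{T_o}(n+1) = \mathbf{T}\mathbf{h}(n+1)$ from both sides of (3.10) one obtains:

\[ \Delta\mathbf{h}(n+1) = \hat{\mathbf{h}}(n+1) - \mathbf{T}\mathbf{h}(n+1) = \hat{\mathbf{h}}(n) - \mathbf{T}\mathbf{h}(n+1) + \mu e(n)\boldsymbol{\Gamma}^{-1}\mathbf{s}(n). \tag{3.22} \]

Using $e(n) = \mathbf{h}^t(n)\mathbf{x}(n) - \hat{\mathbf{h}}^t(n)\mathbf{s}(n) + v(n)$, equation (3.22) can be written as follows:

\[ \Delta\mathbf{h}(n+1) = \hat{\mathbf{h}}(n) - \mathbf{T}\mathbf{h}(n+1) + \mu\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)\left(\mathbf{x}^t(n)\mathbf{h}(n) - \mathbf{s}^t(n)\hat{\mathbf{h}}(n)\right) + \mu\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)v(n). \tag{3.23} \]

Introducing the notation $\boldsymbol{\varepsilon}_T(n) = \mathbf{T}\mathbf{h}(n+1) - \mathbf{T}\mathbf{h}(n) = \mathbf{T}\boldsymbol{\varepsilon}(n)$ and using the above
assumptions and the fact that $\mathbf{s}^t(n) = (\mathbf{T}\mathbf{x}(n))^t = \mathbf{x}^t(n)\mathbf{T}^t$, the above equation can be
written as follows:

\[ \Delta\mathbf{h}(n+1) = \Delta\mathbf{h}(n) - \boldsymbol{\varepsilon}_T(n) - \mu\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)\mathbf{s}^t(n)\Delta\mathbf{h}(n) + \mu\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)v(n). \tag{3.24} \]

The mean coefficient error vector can be obtained by taking the expectation of
both sides of (3.24):

\[ E\{\Delta\mathbf{h}(n+1)\} = E\{\Delta\mathbf{h}(n)\} - E\{\boldsymbol{\varepsilon}_T(n)\} - \mu E\{\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)\mathbf{s}^t(n)\Delta\mathbf{h}(n)\} + \mu E\{\boldsymbol{\Gamma}^{-1}\mathbf{s}(n)v(n)\}. \tag{3.25} \]

Using Assumptions 2, 3 and 4, one obtains⁶:

\[ E\{\Delta\mathbf{h}(n+1)\} = \left(\mathbf{I} - \mu\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\right)^{n+1}E\{\Delta\mathbf{h}(0)\}, \tag{3.26} \]

where I is the N x N identity matrix, $\Delta\mathbf{h}(0) = \hat{\mathbf{h}}(0) - \mathbf{h}_{T_o}$ is the initial coefficient error
vector and $\mathbf{R}_s = E\{\mathbf{s}(n)\mathbf{s}^t(n)\}$ is the autocorrelation matrix of the transform coefficients.

Equation (3.26) can be simplified further if we take into account that the matrix $\mathbf{R}_s$
is close to diagonal and has on the main diagonal the powers of the transform coefficients,
such that the following approximation can be made:

\[ \mathbf{R}_s\boldsymbol{\Gamma}^{-1} = \boldsymbol{\Gamma}^{-1}\mathbf{R}_s \approx \mathbf{I}. \tag{3.27} \]

Using (3.27) in (3.26) one obtains:

\[ E\{\Delta\mathbf{h}(n+1)\} = (1-\mu)^{n+1}E\{\Delta\mathbf{h}(0)\}. \tag{3.28} \]

The convergence of the coefficients in the mean is ensured if:

\[ -1 < 1 - \mu < 1 \;\Longrightarrow\; 0 < \mu < 2. \tag{3.29} \]

⁶ Assumption 3 also implies that $\boldsymbol{\varepsilon}_T(n)$ is zero mean and stationary.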
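We should emphasize here that (3.28) proves the convergence of the adaptive filter
coefficients to $\mathbf{T}\mathbf{h}(n)$ in view of Assumption 3, which stipulates that the sequences $\boldsymbol{\varepsilon}$ and $\boldsymbol{\varepsilon}_T$
are zero mean. If the increments of the unknown filter are not zero mean, the coefficients
of the adaptive filter will converge to a biased solution due to the non-zero term $E\{\boldsymbol{\varepsilon}_T(n)\}$,
which does not vanish in (3.25).

Our main goal here is to derive the analytical equations for the steady-state MSE
in the case of a time-varying environment, which will be used further to introduce a new
algorithm. To this end, we first derive the weight-error correlation matrix defined as
$\mathbf{C}(n) = E\{\Delta\mathbf{h}(n)\Delta\mathbf{h}^t(n)\}$. Multiplying (3.24) by its transpose and following a
procedure similar to the one in [21], we obtain:

\[ \begin{aligned} \mathbf{C}(n+1) = {} & \mathbf{C}(n) - \mu\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\mathbf{C}(n) + \mathbf{Q}_T - \mu\mathbf{C}(n)\mathbf{R}_s\boldsymbol{\Gamma}^{-1} + \mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\boldsymbol{\Gamma}^{-1}\,\mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)] \\ & + 2\mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\mathbf{C}(n)\mathbf{R}_s\boldsymbol{\Gamma}^{-1} + \sigma_v^2\mu^2\boldsymbol{\Gamma}^{-1}\mathbf{R}_s\boldsymbol{\Gamma}^{-1}, \end{aligned} \tag{3.30} \]

where $\mathbf{Q}_T = E\{\boldsymbol{\varepsilon}_T(n)\boldsymbol{\varepsilon}_T^t(n)\}$.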
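Using the approximation from (3.27) in (3.30) one obtains:

\[ \mathbf{C}(n+1) = \mathbf{C}(n) - 2\mu\mathbf{C}(n) + 2\mu^2\mathbf{C}(n) + \mu^2\boldsymbol{\Gamma}^{-1}\,\mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)] + \sigma_v^2\mu^2\boldsymbol{\Gamma}^{-1} + \mathbf{Q}_T. \tag{3.31} \]

At the steady-state, for small values of the step-size $\mu$, the third and the fourth terms
in (3.31) can be neglected⁷. Also for $n \to \infty$, we have $\mathbf{C}(n+1) \approx \mathbf{C}(n)$ and the steady-state
value of the mean squared coefficient error, defined as $\Theta(n+1) = \mathrm{tr}[\mathbf{C}(n+1)]$, can
be written in the following manner:

\[ \Theta_{st} = \frac{1}{2}\left[\mu\sigma_v^2\,\mathrm{tr}\left[\boldsymbol{\Gamma}^{-1}\right] + \mu^{-1}\mathrm{tr}[\mathbf{Q}_T]\right]. \tag{3.32} \]

The output MSE, defined by $J(n) = E\{e^2(n)\}$, is obtained after some mathematical
manipulations as follows:

\[ J(n) = J_{min} + \mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)] = \sigma_v^2 + \mathrm{tr}[\mathbf{R}_s\mathbf{C}(n)], \tag{3.33} \]

where $J_{min} = E[(d(n) - \mathbf{h}_{T_o}^t(n)\mathbf{s}(n))^2]$ is the minimum mean squared error.
Multiplying (3.30) by $\mathbf{R}_s$ and taking the trace, the analytical expression of the
steady-state MSE can be obtained:

\[ J_{st} = \sigma_v^2 + \frac{1}{2 - \mu(N+2)}\left[\mu\sigma_v^2 N + \mu^{-1}\mathrm{tr}[\mathbf{R}_s\mathbf{Q}_T]\right]. \tag{3.34} \]

⁷ This was justified in [22].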
For small values of the step-size a simplified analytical result can be obtained which is
more useful for practical implementations:

\[ J_{st} = \sigma_v^2 + \frac{1}{2}\left[\mu\sigma_v^2 N + \mu^{-1}\mathrm{tr}[\mathbf{R}_s\mathbf{Q}_T]\right]. \tag{3.35} \]

From (3.32) and (3.35), we can see that $\Theta_{st}$ and $J_{st}$ are nonlinear functions of the
step-size $\mu$. Both analytical results show that the steady-state values $\Theta_{st}$ and $J_{st}$ contain
two components. The first component is proportional to the step-size and it is due
to the noisy estimates of the adaptive filter coefficients. The second component, which is
inversely proportional to $\mu$, appears due to the time variations in the unknown filter
coefficients. Due to this fact both $\Theta_{st}$ and $J_{st}$ possess a minimum for a certain value
of the step-size. The optimum step-size $\mu_o^{msc}$ which minimizes $\Theta_{st}$ and the step-size $\mu_o^{mse}$
which minimizes $J_{st}$ are not equal in general and they can be expressed by the following
equations:

\[ \mu_o^{mse} = \sqrt{\frac{\mathrm{tr}[\mathbf{R}_s\mathbf{Q}_T]}{\sigma_v^2 N}} \qquad \text{and} \qquad \mu_o^{msc} = \sqrt{\frac{\mathrm{tr}[\mathbf{Q}_T]}{\sigma_v^2\,\mathrm{tr}\left[\boldsymbol{\Gamma}^{-1}\right]}}. \tag{3.36} \]

Although (3.32) and (3.35) were obtained making some assumptions and approximations,
we found that they provide good practical models for the behavior of $\Theta_{st}$ and $J_{st}$.
We should emphasize here that these results were obtained for the case when the desired
sequence is obtained as the output of a time-varying FIR filter and the input sequence
x(n) is stationary⁸.
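
The optimum step-size for the MSE can be illustrated with the parameter values used later in the simulations (N = 4, increment variance 1e-8, noise variance 25e-4). The sketch assumes that (3.35) reads J_st = σ_v² + (μσ_v²N + tr[R_s Q_T]/μ)/2, that the optimum is the square root of tr[R_s Q_T]/(σ_v² N), that the increments are white so that Q is a scaled identity, and that the input has unit variance (`sigma_x2`, an assumption). With an orthogonal transform, tr[R_s Q_T] then equals the increment variance times N times the input variance.

```python
import numpy as np

def steady_state_mse(mu, N, sigma_v2, tr_RsQT):
    """Small step-size steady-state MSE in a time-varying environment, as in (3.35)."""
    return sigma_v2 + 0.5 * (mu * sigma_v2 * N + tr_RsQT / mu)

N = 4
sigma_v2 = 25e-4        # variance of the output noise
sigma_eps2 = 1e-8       # variance of each increment of the unknown system
sigma_x2 = 1.0          # assumed input variance
tr_RsQT = sigma_eps2 * N * sigma_x2   # tr[R_s Q_T] for Q = sigma_eps2 * I

# Closed-form optimum step-size for the MSE.
mu_o_mse = np.sqrt(tr_RsQT / (sigma_v2 * N))

# Numerical minimization of (3.35) over a fine grid, for comparison.
mu_grid = np.linspace(1e-4, 1e-2, 100001)
mu_numeric = mu_grid[np.argmin(steady_state_mse(mu_grid, N, sigma_v2, tr_RsQT))]
```

Under these assumptions the closed form gives a step-size of 0.002, and the grid search lands on the same value up to the grid resolution.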

The reason to introduce this theoretical analysis was to have the basis for the intro- 


duction of a new algorithm with adaptive step-size for time-varying environments. We 


mse 
oO 


are interested to develop an algorithm in which the step-size is updated toward pi™”°°, such 
that the steady-state MSE J,; is minimized. To this end, we analyze (3.35) and we make 


the notation A = tr |R,Q,]. With this notation, equation (3.35) can be written as follows: 


+ -o, Nut —A. (3.37) 


2 


Vv 


In the above equation there are just two unknown quantities, $\sigma_o^2$ and $A$, as opposed to the time domain counterpart (2.79), where the unknowns are $\sigma_o^2$, tr[R] and tr[Q]. As a consequence, in order to derive an algorithm for step-size adaptation which does not need the knowledge of the statistics of h(n), only these two values must be estimated during the adaptation. For the transform domain, a system of two equations must be solved in order to compute $\mu_o^{mse}$, therefore just two adaptive filters working in parallel are necessary⁹.

⁸ When the input x(n) is non-stationary, the input autocorrelation matrix R and its transform domain counterpart $R_s$ are both non-stationary. Analysis of this case is beyond the scope of this thesis.

The reduction of the number of adaptive filters was the main reason to implement the adaptive step-size in the transform domain. However, the reduction in computational complexity is partly offset by the fact that the input vector x(n) must be transformed to s(n) by means of an orthogonal transformation T, which introduces some extra computations. The second reason, of equal interest, for introducing the step-size adaptation in the transform domain was to increase the convergence speed for highly correlated input sequences. In practice the user must choose whichever of the two alternatives is more suitable for the application at hand. If the time domain implementation provides enough convergence speed (i.e. for small filter lengths), then it might be a good choice. Otherwise, the transform domain is the alternative which ensures an increased speed of convergence.


3.1.3. Simulations and results 


In this section, we show the results of simulations conducted to verify the analytical results from the previous section, namely (3.32), (3.35) and (3.36). To this end, we have implemented an adaptive FIR filter in a time-varying environment as depicted in Fig. 3.2. The model of the time-varying coefficients of the unknown system is expressed by (3.20), where the elements of the vector $\epsilon(n)$ are chosen to be random independent zero mean Gaussian-distributed sequences. The variances of the elements of $\epsilon(n)$ were all equal to $\sigma_\epsilon^2 = 10^{-8}$. The unknown system and the adaptive filter both had length N = 4. The output noise v(n) was a random Gaussian-distributed sequence with zero mean and variance $\sigma_o^2 = 25 \times 10^{-4}$. The algorithm used to update the coefficients of the adaptive filter was the transform domain LMS and the orthogonal transformation T was the Discrete Cosine Transform. The coefficients of the adaptive filter were updated at each iteration using equation (3.10) and the elements of the diagonal matrix $\Gamma^{-1}$ were iteratively estimated by (3.8). The averaging coefficient in (3.8) was chosen to be equal to 0.9, which gives a good trade-off between the accuracy of estimation and the convergence of the diagonal elements of $\Gamma$. The model used to generate the input sequence was the Markov(1) model:

$$x(n+1) = \beta x(n) + \eta(n), \qquad (3.38)$$

where $\beta = 0.75$ and $\eta(n)$ was a random zero mean Gaussian-distributed sequence with the variance chosen such that the variance of x(n) was unity.
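As an illustration of the input model (3.38), the following sketch generates a Markov(1) sequence with unit variance. It is a minimal example and not the simulation code used for the figures; the function name `markov1_input`, the seed handling and the number of samples are assumptions made here. For a stationary first-order process the variance of x(n) equals the variance of $\eta(n)$ divided by $1 - \beta^2$, so unit variance requires $\mathrm{var}(\eta) = 1 - \beta^2$.

```python
import numpy as np

def markov1_input(num_samples, beta=0.75, seed=0):
    """Generate x(n+1) = beta*x(n) + eta(n) with unit steady-state variance."""
    rng = np.random.default_rng(seed)
    # var(x) = var(eta) / (1 - beta**2), hence var(eta) = 1 - beta**2.
    eta_std = np.sqrt(1.0 - beta**2)
    x = np.empty(num_samples)
    x[0] = rng.standard_normal()  # draw the first sample from the stationary law
    for n in range(num_samples - 1):
        x[n + 1] = beta * x[n] + eta_std * rng.standard_normal()
    return x

x = markov1_input(200_000)
sample_variance = np.var(x)
lag1_correlation = np.mean(x[1:] * x[:-1])
```

With enough samples, `sample_variance` should be close to 1 and `lag1_correlation` close to $\beta$.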


⁹ In the time domain, because $J_{st}$ in (2.79) was parametrized with three parameters ($A = \sigma_o^2$ and two further parameters B and C, proportional to $\sigma_o^2\,\mathrm{tr}[R]$ and $\mathrm{tr}[Q]$ respectively), three adaptive filters were needed.

Figure 3.3: Steady-state excess MSE vs. the step-size for a time-varying system identification in transform domain: experimental results (continuous line) and analytical results (dashed line).

Figure 3.4: Steady-state mean squared coefficient error vs. the step-size for a time-varying system identification in transform domain: experimental results (continuous line) and analytical results (dashed line).


With this system setup, the optimum step-sizes which minimize the steady-state MSE and the steady-state mean squared coefficient error were found to be:

$$\mu_o^{mse} = 0.02 \quad \text{and} \quad \mu_o^{\Theta} = 0.0118, \qquad (3.39)$$

where we have used (3.36) to compute $\mu_o^{mse}$ and $\mu_o^{\Theta}$.

To obtain experimentally the dependence between $J_{st}$ and the step-size and also the dependence between $\Theta_{st}$ and $\mu$, we have conducted a set of different simulations. During one simulation, the step-size of the TDLMS was constant. However, the step-sizes used in
different simulations were not equal. We start with 4 = 5 x 107° for the first simulation 
and continue until 44 = 0.05 for the last simulation. The increment of the step-size between 
two experiments was 107%. All the experiments contained a number of 100 independent 
runs of 5 x 10° iterations and the results were averaged. In order to have a more clear 
representation, instead of plotting the MSE, we have chosen to plot the excess MSE as a function of the step-size. Therefore, in every experiment we have computed the steady-state excess MSE and the steady-state mean squared coefficient error, which are plotted versus the corresponding step-size in Fig. 3.3 and Fig. 3.4 respectively. In Fig. 3.4 and Fig. 3.3 we have also plotted the value of $\Theta_{st}$ obtained by evaluation of (3.32) and the steady-state excess MSE obtained from the evaluation of the following analytical expression:

$$J_{ex} = \frac{1}{2}\left\{\mu\sigma_o^2 N + \mu^{-1}\mathrm{tr}[R_s Q_s]\right\}. \qquad (3.40)$$
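The location of the minimum of (3.40) can be checked numerically. Setting the derivative of (3.40) with respect to $\mu$ to zero gives $\mu = \sqrt{\mathrm{tr}[R_s Q_s] / (\sigma_o^2 N)}$. The sketch below compares this closed form with a brute-force search on a grid. The function names and the numerical values are illustrative assumptions and are not the values of the experiment described above.

```python
import numpy as np

def excess_mse(mu, noise_var, N, A):
    """Analytical excess MSE of (3.40), with A = tr[R_s Q_s]."""
    return 0.5 * (mu * noise_var * N + A / mu)

def mse_optimal_step(noise_var, N, A):
    """Step-size at which the derivative of (3.40) with respect to mu vanishes."""
    return np.sqrt(A / (noise_var * N))

# Illustrative values only (hypothetical, not the thesis setup).
noise_var, N, A = 1e-3, 8, 2e-6
mu_opt = mse_optimal_step(noise_var, N, A)

# Brute-force check: the grid minimum should sit next to the closed form.
grid = np.linspace(0.2 * mu_opt, 5.0 * mu_opt, 2001)
mu_grid = grid[np.argmin(excess_mse(grid, noise_var, N, A))]
```

Because the first term of (3.40) grows linearly with $\mu$ and the second decays as $1/\mu$, the curve has a single minimum, which is why a coarse grid search is sufficient as a sanity check.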


From Fig. 3.3 and Fig. 3.4 we can see that there are small differences between the experimental and analytical results. These differences are due to the fact that some assumptions and approximations were made in the derivation of (3.32) and (3.35). For instance, to obtain (3.32) we have assumed that the third and fourth terms in (3.31) can be neglected. Moreover, the term $2 - \mu(N + 2)$ was approximated with 2 in order to simplify both (3.32) and (3.35). Nevertheless, these differences between theory and practice are small, and the important point is that the optimum step-size is preserved between the experimental and the analytical results (for both the steady-state MSE and $\Theta_{st}$, the analytical and experimental curves have their minimum around the same point, corresponding to the optimum step-size).

In conclusion, we can state that (3.35) represents a good basis for the derivation of step-size adaptation, as we will see in Section 3.3.


3.2. Transform domain variable step-size LMS algorithms


In this section we address the problem of step-size adaptation for the transform domain LMS algorithm. We note that our discussion here focuses on stationary environments, whereas implementations for time-varying environments are addressed in the next section. It is well known that, for highly correlated input signals, the speed of convergence of the time domain LMS algorithm degrades dramatically. As an alternative solution, different modifications of the LMS algorithm with variable step-size as well as transform domain LMS (TDLMS) algorithms have been developed in the open literature (see, e.g., [1], [4], [7], [9], [11], [18], [19], [20], [25], [27], [29], [30], [38], [58], [61], [62]).

As we have seen in Section 3.1, in the case of TDLMS the input signal is transformed using an orthogonal transform and the filter coefficients are updated with independent step-sizes as shown in (3.7). In the existing approaches, the step-sizes are often considered time-varying due to the power estimates of the transform coefficients¹⁰ in (3.7). When the power estimates become constant, the step-sizes corresponding to the individual filter coefficients are also constant. As a consequence, when the input signal is stationary, the step-size of each filter tap is time-varying only during the early stages of the adaptation and after that it is constant. However, there are no TDLMS algorithms known so far that use the output error in the update of the step-size. Here we introduce some modifications of the TDLMS algorithm which have the following feature: we define for each step-size a local component depending on the power normalization and a global component that is the same for each filter tap. As opposed to the existing approaches, the global component is also time-variable and depends on the output error, such that the speed of convergence of the new algorithms is significantly improved.

¹⁰ Some authors consider the step-size corresponding to the $i$-th adaptive filter coefficient in (3.7) to be $\mu_i(n) = \mu / (\epsilon + \sigma_i^2(n))$, which is time-varying due to the normalization with $\sigma_i^2(n)$.

In our discussion we first briefly describe the existing approaches in the open literature, and then three new transform domain algorithms with variable step-size are introduced.


3.2.1 Existing approaches 


When the TDLMS is used to update the coefficients of an adaptive filter, equation (3.7) is usually implemented and the power estimates are obtained in various ways. One way is to use the averaging method of (3.8), whereas other publications propose different methods, such as the Gram-Schmidt normalization in [60] or another averaging method as in [45]. Also, analyses of different orthogonal transforms for various types of input sequences have been published [4], [21], and many performance indexes which express their decorrelation properties have been defined. It is well known that the optimum transform is the KLT, which diagonalizes the input autocorrelation matrix R, but it requires knowledge of the input signal statistics in order to be implemented. Other transforms, such as the DCT, have been shown to be close to the KLT for different signal distributions.
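The decorrelation effect of the DCT can be illustrated numerically. The sketch below builds the autocorrelation matrix of a Markov(1) input, applies an orthonormal DCT-II, normalizes by the transform-domain powers, and compares the eigenvalue spread before and after. The helper names, the choice N = 16 and the correlation coefficient 0.9 are assumptions made for this illustration. The power-normalized matrix $\Gamma^{-1/2} R_s \Gamma^{-1/2}$ is used here as the usual way of assessing the conditioning seen by a power-normalized transform domain filter.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (rows are basis vectors), so that s = T @ x."""
    n = np.arange(N)
    k = n.reshape(-1, 1)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * k / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def eigenvalue_spread(R):
    """Ratio between the largest and the smallest eigenvalue of a symmetric matrix."""
    eig = np.linalg.eigvalsh(R)  # returned in ascending order
    return eig[-1] / eig[0]

N, rho = 16, 0.9
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])  # Markov(1) autocorrelation matrix
T = dct_matrix(N)
R_s = T @ R @ T.T                               # transform domain autocorrelation
g = 1.0 / np.sqrt(np.diag(R_s))                 # power normalization
R_norm = g[:, None] * R_s * g[None, :]

spread_time = eigenvalue_spread(R)
spread_transform = eigenvalue_spread(R_norm)
```

For this strongly correlated input the spread after the transform and power normalization is much smaller than the spread of R itself, which is the property the TDLMS relies on.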

A very interesting approach was proposed by Kim and Wilde in [45], where the co- 
efficients of the adaptive filter have different time-varying step-sizes. In the case of the 
DCT-LMS introduced in [45], the step-sizes are changed based on the following formula: 


pln +1) = Buu(n) +9 = 8) (> a or (3.41) 
where $\mathbf{s}_i(n) = [s_i(n), s_i(n-1), \ldots, s_i(n-M+1)]^t$ is the vector of the past M values of the $i$-th transform coefficient $s_i(n)$, and $\beta \in [0,1]$, $\gamma \in [0,1]$ and $0 < \epsilon < 1$ are constant parameters.

In [45], the theoretical analysis of the DCT-LMS was provided together with the 
simulations showing the performances of the proposed algorithm for system identification 
and channel equalization applications. It was shown that the DCT-LMS performs better in 
terms of convergence speed than the TDLMS which uses (3.7) and (3.8) for the coefficients 
adaptation. The theoretical analysis for the DCT-LMS was done in a similar manner with 
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that of the time domain LMS and the value of the steady-state misadjustment was found 
to be: 


La 
E lu? 
where y; = Petey with EF [;(00)] and E [?(co)] being the steady-state mean and 


respectively mean squared value of the step-size. The analytical expression for the mean 
and mean squared value of the step-size were also given in [45]. 

Since the DCT-LMS was the only transform domain algorithm with variable step-size that we have found in the open literature, our new algorithms are compared with it for benchmark purposes. However, as can be seen in (3.41), the step-size of each coefficient of the DCT-LMS is not updated based on the evolution of the output error; therefore one can include this algorithm in the class of algorithms which use an improved normalization method.


3.2.2 Transform domain LMS adaptive filter with variable step-size


Here, we propose a new algorithm that uses the output error in order to update the step-size of each filter tap, resulting in a significant improvement of the convergence speed. To develop our new algorithm we start from the well known method in which the step-size of the $i$-th coefficient is computed as follows:

$$\mu_i(n) = \frac{\mu}{\epsilon + \sigma_i^2(n)}. \qquad (3.43)$$

In (3.43), the numerator $\mu$ can be viewed as the global component of the step-size, since it is the same for each coefficient, and the denominator can be viewed as the local component of the step-size, which depends on the power estimate $\sigma_i^2(n)$ of the corresponding transform coefficient.

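In the approaches proposed so far, only the local component is variable, whereas in our new algorithm the global component is also time-varying and depends on the output error as follows:

$$\mu'(n) = \alpha\,\mu(n-1) + \frac{\gamma}{L}\sum_{k=n-L}^{n-1} e^2(k), \qquad (3.44)$$

$$\mu(n) = \begin{cases} \mu'(n), & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) \in (\mu_{min}, \mu_{max}) \\ \mu_{min}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) < \mu_{min} \\ \mu_{max}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) > \mu_{max} \\ \mu(n-1), & \text{otherwise.} \end{cases} \qquad (3.45)$$

In the case of the new TDVSLMS algorithm, the update of each filter coefficient is given by:

$$h_i(n+1) = h_i(n) + \frac{\mu(n)}{\epsilon + \sigma_i^2(n)}\,e(n)\,s_i(n), \qquad (3.46)$$

where the notations are the same as for the standard TDLMS algorithm, $\mu(n)$ is given by (3.44) and (3.45) and we have used (3.8) for power estimation.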

The behavior of the new TDVSLMS algorithm can be described as follows: for L consecutive iterations (the test interval of length L), the global component $\mu(n)$ is constant and the new algorithm behaves as a standard TDLMS. At the end of the test interval, the average of the past L squared values of the error is computed, and $\mu(n)$ is updated according to (3.44) and (3.45). In this way, when the output error is large the step-size is increased, such that the convergence time is shortened. When the adaptive filter approaches the steady-state, the error decreases, which also decreases the global component of every step-size. In all our simulations, we have used L = 10, which gives good performance. Usually, the parameter $\gamma$ in (3.44) has a small value, and it may be chosen to meet the misadjustment requirements.
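The behavior described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the implementation used for the reported simulations: the power estimate is assumed to follow a first-order recursion of the form $\sigma_i^2(n) = \beta\sigma_i^2(n-1) + (1-\beta)s_i^2(n)$ standing in for (3.8), the global component is blended with the average squared error and clipped to $[\mu_{min}, \mu_{max}]$ at the end of every test interval, and the initial values, the test system and all function names are hypothetical.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix, s = T @ x."""
    n = np.arange(N)
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n + 1) * n[:, None] / (2 * N))
    T[0, :] /= np.sqrt(2.0)
    return T

def tdvslms(x, d, N, L=10, alpha=0.9, gamma=1e-2, beta=0.9, eps=1e-3,
            mu_min=5e-3, mu_max=5e-2):
    T = dct_matrix(N)
    h = np.zeros(N)          # transform domain coefficients
    power = np.ones(N)       # initial power estimates (assumption)
    mu = mu_max              # initial global component (assumption)
    err_sq_sum = 0.0
    mu_history = []
    for n in range(N - 1, len(x)):
        s = T @ x[n - N + 1:n + 1][::-1]            # transformed input vector
        power = beta * power + (1 - beta) * s**2    # assumed form of (3.8)
        e = d[n] - h @ s
        h = h + mu / (eps + power) * e * s          # global / local step-size
        err_sq_sum += e**2
        if (n - N + 2) % L == 0:                    # end of a test interval
            mu_prime = alpha * mu + gamma * err_sq_sum / L
            mu = min(max(mu_prime, mu_min), mu_max)
            err_sq_sum = 0.0
        mu_history.append(mu)
    return h, T, np.array(mu_history)

# Hypothetical test system: 4 taps, Markov(1) input, small output noise.
rng = np.random.default_rng(1)
N = 4
h_true = np.array([1.0, -0.5, 0.25, 0.1])
x = np.zeros(6000)
for n in range(1, len(x)):
    x[n] = 0.75 * x[n - 1] + np.sqrt(1 - 0.75**2) * rng.standard_normal()
d = np.convolve(x, h_true)[:len(x)] + 0.01 * rng.standard_normal(len(x))

h, T, mu_hist = tdvslms(x, d, N)
coef_error = np.linalg.norm(h - T @ h_true)  # optimum in transform domain is T @ h_true
```

In this sketch the global component stays near its upper bound while the error is large and falls back to the lower bound once the error approaches the noise floor, so the final misadjustment is governed by the lower bound.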


The steady-state mean squared error analysis of the new TDVSLMS algorithm 


The steady-state analysis of the TDVSLMS algorithm is done in order to find the relation between the misadjustment of the algorithm and its parameters. Based on the analytical expression of the steady-state misadjustment, we will discuss how to set the parameters of the algorithm in order to obtain the desired performance in terms of convergence speed and small steady-state error. To start the analysis, we first rewrite (3.10) in the following form:

$$\mathbf{h}(n+1) = \mathbf{h}(n) + \tilde{\mu}(n)\mathbf{s}(n)e(n), \qquad (3.47)$$

where $\tilde{\mu}(n) = \mu(n)\Gamma^{-1}(n)$ is an $N \times N$ diagonal matrix with the diagonal elements given by $\tilde{\mu}_{ii}(n) = \mu(n)\Gamma_i^{-1}(n)$ and $\mu(n)$ is given by (3.45).

To make the convergence analysis of the TDVSLMS algorithm more tractable, besides the usual assumptions, we introduce the following ones:

Assumption 1: After a few iterations the power estimates of the transform coefficients $s_i(n)$ become constant; therefore, the step-sizes $\tilde{\mu}_i(n)$ are independent of $\mathbf{s}(n)$.

Assumption 2: The step-sizes $\tilde{\mu}_i(n)$ in (3.47) and the output error e(n) are independent. This can be justified by the update method of the step-size, which uses only the past L values of the error as shown in (3.45).
The mean squared error (MSE) is defined by (3.16), which we rewrite here for convenience:

$$J(n) = J_{min} + \mathrm{tr}[R_s C(n)], \qquad (3.48)$$

where $J_{min}$ is the minimum MSE obtained in the case of perfect adaptation, C(n) is the covariance matrix of the coefficient error vector and $R_s$ is the autocorrelation matrix of the transform coefficients.

The input autocorrelation matrix can be expressed as $R_s = Q\Lambda_s Q^t$, where $\Lambda_s = \mathrm{diag}[\lambda_0, \ldots, \lambda_{N-1}]$ is the diagonal matrix having on the main diagonal the eigenvalues of $R_s$, Q is the modal matrix of $R_s$, $QQ^t = I$ and $Q^{-1} = Q^t$. Denoting $C'(n) = QC(n)Q^t$, (3.48) can be rewritten as follows:

$$E\{e^2(n)\} = J_{min} + \mathrm{tr}[\Lambda_s C'(n)] = J_{min} + \sum_{i=0}^{N-1}\lambda_i c'_{ii}(n), \qquad (3.49)$$

where $c'_{ii}(n)$ are the diagonal elements of $C'(n)$.

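The coefficient error vector $\Delta\mathbf{h}(n) = \mathbf{h}(n) - \mathbf{h}_o$ is defined as the difference between the adaptive filter coefficients and the optimum Wiener solution $\mathbf{h}_o$. Using (3.47), it satisfies the following equation:

$$\mathbf{h}(n+1) - \mathbf{h}_o = \mathbf{h}(n) - \mathbf{h}_o + \tilde{\mu}(n)\mathbf{s}(n)e(n) = \left[I - \tilde{\mu}(n)\mathbf{s}(n)\mathbf{s}^t(n)\right]\Delta\mathbf{h}(n) + \tilde{\mu}(n)e_o(n)\mathbf{s}(n), \qquad (3.50)$$

where $e_o(n) = e(n) + \Delta\mathbf{h}^t(n)\mathbf{s}(n)$ is the output error of the optimum filter.

Computing the outer product of (3.50) with itself, taking the expectations on both sides and using the fact that $C'(n)$ was obtained from the relation $C'(n) = QC(n)Q^t$, one obtains:

$$C'(n+1) = C'(n) - E\{\tilde{\mu}(n)\}\left[C'(n)\Lambda_s + \Lambda_s C'(n)\right] + E\{\tilde{\mu}^2(n)\}\left[J_{min}\Lambda_s + 2\Lambda_s C'(n)\Lambda_s + \mathrm{tr}[\Lambda_s C'(n)]\Lambda_s\right]. \qquad (3.51)$$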


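Thus the diagonal elements $c'_{ii}(n)$ of the matrix $C'(n)$ are obtained from (3.51) as follows:

$$c'_{ii}(n+1) = \left[1 - 2E\{\tilde{\mu}_{ii}(n)\}\lambda_i\right]c'_{ii}(n) + 2E\{\tilde{\mu}_{ii}^2(n)\}\lambda_i^2 c'_{ii}(n) + E\{\tilde{\mu}_{ii}^2(n)\}\lambda_i J_{min} + E\{\tilde{\mu}_{ii}^2(n)\}\lambda_i\sum_{j=0}^{N-1}\lambda_j c'_{jj}(n). \qquad (3.52)$$

The mean value of the variable step-size $\tilde{\mu}_i(n)$ is given by:

$$E\{\tilde{\mu}_i(n)\} = E\left\{\frac{\mu(n)}{\Gamma_i(n)}\right\} = \frac{E\{\mu(n)\}}{\Gamma_i}, \qquad (3.53)$$

with $E\{\mu(n)\}$ given by:

$$E\{\mu(n)\} = \begin{cases} E\{\mu'(n)\}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) \in (\mu_{min}, \mu_{max}) \\ \mu_{min}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) < \mu_{min} \\ \mu_{max}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) > \mu_{max} \\ E\{\mu(n-1)\}, & \text{otherwise.} \end{cases} \qquad (3.54)$$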


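Taking the expectation operator in (3.44), the mean value $E\{\mu'(n)\}$ can be computed as follows:

$$E\{\mu'(n)\} = \alpha E\{\mu(n-1)\} + \frac{\gamma}{L}E\left\{\sum_{k=n-L}^{n-1} e^2(k)\right\} = \alpha E\{\mu(n-1)\} + \frac{\gamma}{L}\sum_{k=n-L}^{n-1} J(k), \qquad (3.55)$$

where $E\{e^2(k)\} = J(k)$ is the MSE at time instant k.

The mean squared value of the step-size $\tilde{\mu}_i(n)$ corresponding to the $i$-th coefficient is given by:

$$E\{\tilde{\mu}_i^2(n)\} = \frac{E\{\mu^2(n)\}}{\Gamma_i^2}, \qquad (3.56)$$

and the numerator in (3.56) can be expressed as follows:

$$E\{\mu^2(n)\} = \begin{cases} E\{\mu'^2(n)\}, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) \in (\mu_{min}, \mu_{max}) \\ \mu_{min}^2, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) < \mu_{min} \\ \mu_{max}^2, & \text{if } n-1 = L, 2L, \ldots \text{ and } \mu'(n) > \mu_{max} \\ E\{\mu^2(n-1)\}, & \text{otherwise.} \end{cases} \qquad (3.57)$$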

The mean squared value of $\mu'(n)$ is obtained from (3.44) as follows:

$$E\{\mu'^2(n)\} = \alpha^2 E\{\mu^2(n-1)\} + \frac{2\alpha\gamma}{L}E\left\{\mu(n-1)\sum_{k=n-L}^{n-1} e^2(k)\right\} + \frac{\gamma^2}{L^2}E\left\{\left(\sum_{k=n-L}^{n-1} e^2(k)\right)^2\right\}. \qquad (3.58)$$

For small values of $\gamma$ the term $\frac{\gamma^2}{L^2}E\left\{\left(\sum_{k=n-L}^{n-1} e^2(k)\right)^2\right\}$ has negligible values at the steady-state and it can be discarded from (3.58). Moreover, if we use Assumption 2, the following expression is obtained for the steady-state mean squared value of $\mu'(n)$:

$$E\{\mu'^2(\infty)\} = \alpha^2 E\{\mu^2(\infty)\} + 2\alpha\gamma E\{\mu(\infty)\}J_{st}. \qquad (3.59)$$

To obtain the steady-state MSE we need to compute the mean and mean squared values of the step-sizes $\tilde{\mu}_i$ at the steady-state. Combining (3.53) with (3.54) and (3.55), and combining (3.56) with (3.57) and (3.58), the mean and mean squared values of $\tilde{\mu}_i$ are obtained as follows:

$$E\{\tilde{\mu}_i(\infty)\} = \frac{E\{\mu(\infty)\}}{\Gamma_i} = \frac{\gamma J_{st}}{(1-\alpha)\Gamma_i}, \qquad E\{\tilde{\mu}_i^2(\infty)\} = \frac{E\{\mu^2(\infty)\}}{\Gamma_i^2} = \frac{2\alpha\gamma^2 J_{st}^2}{(1-\alpha)(1-\alpha^2)\Gamma_i^2}. \qquad (3.60)$$
We note that, in the derivation of (3.60), we have assumed that the step-size is between $\mu_{min}$ and $\mu_{max}$. The steady-state misadjustment can be obtained from (3.49), (3.52) and (3.60) following a derivation similar to the one in [48] and [45]:

$$M = \frac{J_{st} - J_{min}}{J_{min}} = \frac{\alpha\gamma J_{st}}{1-\alpha^2}\sum_{i=0}^{N-1}\frac{\lambda_i}{\Gamma_i}. \qquad (3.61)$$

Since the matrix $R_s$ is nearly diagonal and the power estimates $\Gamma_i$ are close to the real powers, the summation in (3.61) can be approximated by N (due to the fact that $\Gamma_i \approx \lambda_i$). Substituting $J_{st} = J_{min}(1 + M)$ and solving for M, the steady-state misadjustment can finally be written as follows:

$$M = \frac{\dfrac{\alpha\gamma N}{1-\alpha^2}J_{min}}{1 - \dfrac{\alpha\gamma N}{1-\alpha^2}J_{min}}, \qquad (3.62)$$

which is similar to the results derived in [48].

We note that for a constant step-size, say $\mu(n) = \mu$, equation (3.62) can be simplified to:

$$M = \frac{1}{2}\mu N, \qquad (3.63)$$

that is the well known approximation for the misadjustment of the TDLMS with fixed step-size.

The value of the parameter L in (3.45) is not critical for the algorithm. We have seen in our simulations that L has to be smaller than the convergence time of the algorithm in order to have enough step-size updates. Any value $L \le 10$ seems to be a good choice for a wide range of applications.

For the parameter $\alpha$ we have found that a value $\alpha = 0.9$ gives good performance in all our simulations. Setting $\alpha = 0.9$, the value of the parameter $\gamma$ can be obtained from (3.62) such that a desired level of the misadjustment is obtained.
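The parameter selection described above can be sketched as follows. The sketch assumes that (3.62) has the form $M = X/(1-X)$ with $X = \alpha\gamma N J_{min}/(1-\alpha^2)$, where $\alpha$ denotes the forgetting factor of the global step-size, and it simply inverts this relation for $\gamma$. The function names and the numerical values are hypothetical, and the relation only holds while the global component stays strictly between its bounds.

```python
def misadjustment(gamma, alpha, N, j_min):
    """Misadjustment under the assumed form M = X / (1 - X)."""
    x = alpha * gamma * N * j_min / (1.0 - alpha**2)
    return x / (1.0 - x)

def gamma_for_misadjustment(M, alpha, N, j_min):
    """Invert the assumed relation: X = M / (1 + M), then solve X for gamma."""
    return M * (1.0 - alpha**2) / ((1.0 + M) * alpha * N * j_min)

# Illustrative values only: target M = 0.04, N = 16 taps, minimum MSE 1e-5.
gamma = gamma_for_misadjustment(0.04, alpha=0.9, N=16, j_min=1e-5)
M_check = misadjustment(gamma, alpha=0.9, N=16, j_min=1e-5)
```

Since $J_{min}$ enters the relation, an estimate of the minimum MSE (in system identification, the output noise variance) is needed before $\gamma$ can be chosen in this way.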
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Figure 3.5: The block diagram of the transform domain complementary pair LMS with 


variable step-size. 


3.2.3. The transform domain complementary pair variable step-size LMS algorithm


In the development of the transform domain complementary pair LMS algorithm with variable step-size (TDCPVSLMS), we consider equation (3.43), which describes the independent step-size computation. Again, $\mu$ in (3.43) is viewed as an overall component of $\mu_i(n)$ since it is the same for each filter tap, whereas $\epsilon + \sigma_i^2(n)$ is the local component of $\mu_i(n)$ (it is different for each filter tap). In the new implementation, the overall component is also time-variable, as in the algorithm introduced in the previous section. However, the expression used to update $\mu(n)$ is different. Actually, the TDCPVSLMS algorithm, depicted in Fig. 3.5, is the transform domain implementation of the CP-VSLMS from Section 2.2. In Fig. 3.5, the block denoted by T represents the transform applied to the input signal x(n), and s(n) is the vector of the transform coefficients. As in the case of CP-VSLMS, there are two adaptive filters that work in parallel: the speed mode filter $\mathbf{h}_1(n)$ and the accuracy mode filter $\mathbf{h}_2(n)$. The adaptive filter $\mathbf{h}_1(n)$ is an additional filter which uses a large and fixed overall component $\mu_1$ and is implemented just to increase the convergence speed of the algorithm, whereas $\mathbf{h}_2(n)$ represents the filter of interest.
The TDCPVSLMS algorithm is described as follows:

• Compute the outputs and the output errors of the two adaptive filters:

$$y_1(n) = \mathbf{h}_1^t(n)\mathbf{s}(n), \quad y_2(n) = \mathbf{h}_2^t(n)\mathbf{s}(n), \quad e_1(n) = d(n) - y_1(n), \quad e_2(n) = d(n) - y_2(n). \qquad (3.64)$$

• Update the coefficients of the speed mode filter:

$$\mathbf{h}_1(n+1) = \mathbf{h}_1(n) + \mu_1\Gamma^{-1}(n)e_1(n)\mathbf{s}(n). \qquad (3.65)$$

• Update the coefficients of the accuracy mode filter:

$$\mathbf{h}_2(n+1) = \begin{cases} \mathbf{h}_1(n+1), & \text{if } n = L, 2L, \ldots \text{ and } \prod_{i=1}^{M} Q(i) = 1 \\ \mathbf{h}_2(n) + \mu_2(n)\Gamma^{-1}(n)e_2(n)\mathbf{s}(n), & \text{otherwise.} \end{cases} \qquad (3.66)$$

• Update the overall component of the step-size for the accuracy mode filter:
M 
nna if n=L,2L,..., and [[Q()=1 
i=l 
_ M 
Healer) — max ajpie(n); Miminty I SB 2b oy and’ [LQOG)=0 Coe, 
w=l1 
flo(n), otherwise 


The matrix $\Gamma$ in (3.65) and (3.66) is a diagonal matrix whose elements are iteratively computed using (3.8). The value of Q in (3.66) and (3.67) is computed as:


n—(i-1)M n—(i-1)M 


o-f) # F aa)> Fat), es 


k=n—iM k=n—iM 
0 otherwise 

According to the above equations, the behavior of the new algorithm can be described as follows: for L consecutive iterations, which represent the test interval, the value of the overall component $\mu_2(n)$ is constant and the adaptive filters $\mathbf{h}_1(n)$ and $\mathbf{h}_2(n)$ perform independently as plain TDLMS filters. On this interval, the local averages of $e_1^2(n)$ and $e_2^2(n)$ are also computed. If the local average of $e_2^2(n)$ is larger than the local average of $e_1^2(n)$ for M consecutive test intervals, then the coefficients $\mathbf{h}_2(n)$ are replaced with the values of $\mathbf{h}_1(n)$. This is due to the fact that the filter $\mathbf{h}_1(n)$ converges faster than $\mathbf{h}_2(n)$ because of the larger value of $\mu_1$. When the speed mode filter performs better (in terms of output mean squared error) than the accuracy mode filter, it also means that a larger step-size would be beneficial. This is the reason why the value of $\mu_2(n)$ is increased in order to get a faster convergence. When the two algorithms approach the steady-state, the local average of $e_2^2(n)$ becomes smaller than the local average of $e_1^2(n)$ and the value of $\mu_2(n)$ is decreased in order to obtain the desired steady-state misadjustment¹¹.

¹¹ The speed mode filter reaches its steady-state faster than the accuracy mode filter due to the fact that $\mu_2(n) \le \mu_1$ at each iteration.

Figure 3.6: The block diagram of the transform domain noise constrained LMS with variable step-size, for system identification.


The maximum and minimum values of $\mu_2(n)$ are $\mu_1$ and $\mu_{min}$ respectively. At the steady-state the value of the overall component of the step-size $\mu_2(n)$ is constant and equal to $\mu_{min}$. As a consequence, the steady-state misadjustment of the new algorithm equals the misadjustment of a plain TDLMS algorithm with step-size $\mu_{min}$.

The parameter L (the length of the test interval) in (3.68) should be sufficiently large, $L \gg 1$, such that the statistical averages of $e_1^2(n)$ and $e_2^2(n)$ can be obtained. Also, L should be much smaller than the length P of the training input x(n), such that a sufficient number of re-initializations is possible. The parameter M in (3.66) and (3.67) must be chosen according to the training input length and the noise level. The total comparison length M × L should be much smaller than P in order to ensure a prompt update of the variable step-size and filter coefficients. The value of M also has to be large enough to avoid mistaken re-initializations due to the noise level. In our simulations we have used L = 10 and M = 3 and we have obtained good results.


3.2.4 Noise constrained LMS algorithm in transform domain 


As in the time domain implementation from Chapter 2, the main reason to introduce the Transform Domain Noise Constrained Variable Step-size LMS (TDNCVSLMS) algorithm was to reduce the complexity of the algorithm from the previous section, which uses two adaptive filters working in parallel to increase the convergence speed. The algorithm from Section 3.2.2 uses just one adaptive filter and its computational complexity is much lower than the computational complexity of the TDCPVSLMS. However, the setup of its parameters is more complicated since the analytical expression of the misadjustment (3.62) includes the value of the minimum MSE. Here, we propose a new algorithm that uses some information about the noise variance $\sigma_o^2$ for updating the step-size in order to increase the convergence speed¹². The block diagram of the new algorithm for system identification is shown in Fig. 3.6. The algorithm is described by the same steps as in the case of TDLMS; the only difference is that the global component of the step-size is time-varying.
The coefficients of the adaptive filter are updated by the following equation: 


$$\mathbf{h}(n+1) = \mathbf{h}(n) + \mu(n)\Gamma^{-1}\mathbf{s}(n)e(n) \qquad (3.69)$$


and the step-size is changed according to: 


1 
bas ety if 7 SS e(i)? >C(n) and n=L,2L,... 
i=n—-L+1 
1)= 1 m 
ee mar{Umin, au(n)} if — S- e(i)? <C(n) and n=L,2L,... 
i=n—-L+1 
p(n) otherwise 


(3.70) 
The behavior of the new TDNCVSLMS algorithm is as follows: for L consecutive iterations, the global component $\mu(n)$ of the step-size is kept constant and the algorithm performs as a standard TDLMS algorithm with fixed step-size. With this constant value of $\mu(n)$ the algorithm would have a certain steady-state MSE denoted by C(n). At the end of the test interval (after L iterations), the average of the squared error is computed. If the average of the squared error is larger than C(n), then the step-size $\mu(n)$ is increased. This means that the algorithm is far from the intermediate steady-state and, in order to speed up the convergence, the global component of the step-size has to be increased. Otherwise, when the algorithm reaches the intermediate steady-state within the test interval of length L, the step-size is decreased in order to obtain a desired steady-state level. As we can see, the value of C(n) is changed each time the step-size is modified and represents the MSE that would be obtained if the algorithm kept a constant step-size until convergence.
The minimum value allowed for the global component $\mu(n)$ is $\mu_{min}$ and at the steady-state the algorithm performs as a TDLMS with constant step-size $\mu(n) = \mu_{min}$. Therefore, the final level of the misadjustment is given by $\mu_{min}$. The maximum value of $\mu(n)$ can be very close to $\mu_{max}$ but is always smaller than $\mu_{max}$.

¹² This is the transform domain counterpart of the algorithm introduced in Section 2.2.3. We emphasize the difference between the NCVSLMS and the transform domain implementation, which lies in the fact that in the time domain we have used the median operation to test when the algorithm is at steady-state, whereas in the transform domain the average of the squared error is used.


Analysis and the setup of the coefficients 


Here, we discuss the selection of the coefficients of the TDNCVSLMS algorithm. We start with the condition that ensures the stability of the algorithm. As we can see from (3.70), the global component $\mu(n)$ of the new algorithm is bounded by $\mu_{min} \le \mu(n) < \mu_{max}$. Since the new algorithm performs as the standard TDLMS algorithm inside each test interval, the stability analysis can be done using well known methods (see [45], [58] and the references therein). Following the derivations in [45] and making
the usual assumptions, the TDNCVSLMS algorithm converges when 0 < Umar < - 

In order to compute the value of C(n), we consider L consecutive iterations on which the overall step-size is constant. The value of C(n) is set to the steady-state MSE that would be obtained at the output of the adaptive filter if the step-size were kept constant. In Section 3.1 a simplified analytical expression for the steady-state MSE was derived, therefore C(n) can be approximated by:

$$C(n) = \left(1 + \frac{1}{2}N\mu(n)\right)\sigma_o^2. \qquad (3.71)$$

If the noise variance cannot be accurately estimated, then a penalty term $\Phi$ can be introduced in (3.71) as follows:

$$C(n) = \Phi\left(1 + \frac{1}{2}N\mu(n)\right)\sigma_o^2, \qquad (3.72)$$

where $\Phi > 1$ and close to 1.

The value of L in (3.70) has to be large enough such that the MSE can be approximated, and it also has to be smaller than the convergence time of a TDLMS with fixed step-size $\mu_{min}$, such that a sufficient number of step-size updates occur. In a large number of simulations, for different signal to noise ratios, the selection L = 10 and $\alpha = 0.9$ gave good performance.
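The threshold test can be illustrated with a short sketch. It assumes that C(n) has the form $\Phi(1 + \frac{1}{2}N\mu(n))\sigma_o^2$, which reduces to the case without penalty for $\Phi = 1$, and it implements only the comparison against the averaged squared error and the decrease branch $\max\{\mu_{min}, \alpha\mu(n)\}$. The rule used to increase the step-size is deliberately left out. All function names and numerical values are hypothetical.

```python
import numpy as np

def nc_threshold(mu, N, noise_var, phi=1.0):
    """Assumed form of C(n): steady-state MSE of a TDLMS with fixed global step mu."""
    return phi * (1.0 + 0.5 * N * mu) * noise_var

def far_from_steady_state(errors, mu, N, noise_var, phi=1.0):
    """True when the average squared error over the test interval exceeds C(n)."""
    return np.mean(np.square(errors)) > nc_threshold(mu, N, noise_var, phi)

def decrease_step(mu, alpha, mu_min):
    """Decrease branch of the step-size update: max{mu_min, alpha * mu}."""
    return max(mu_min, alpha * mu)

# Illustrative values: N = 16 taps, noise variance 1e-5, current global step 5e-3.
threshold = nc_threshold(5e-3, 16, 1e-5)
large_errors = np.full(10, 0.1)    # average squared error 1e-2
small_errors = np.full(10, 1e-3)   # average squared error 1e-6
```

With these values the threshold is only 4% above the noise variance, which shows why an inaccurate noise variance estimate can bias the test and why a penalty factor slightly larger than one may be useful.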


3.2.5 Simulations and results 


In this section the results obtained for the problem of system identification are presented. The compared algorithms, in this framework, are: the plain time domain LMS with constant step-size, the VSSLMS from [48], the RVSLMS introduced in [1], the CP-LMS from [61], the CP-VSLMS and the NCVSLMS described in Section 2.2, the plain TDLMS with fixed step-size, the DCT-LMS introduced in [45] and the TDVSLMS, TDCPVSLMS and TDNCVSLMS described in this chapter. For the time domain implementations the block diagrams used in the simulations are those from Section 2.2, whereas for the transform domain implementations we have used the block diagrams shown in the previous sections.

The unknown system has N = 16 constant coefficients and all adaptive filters were chosen of the same length. The transform used in all transform domain implementations was the DCT, which gives real coefficients and also shows good decorrelation properties for many input sequences found in practice.


The input sequence was generated by the following autoregressive model:

$$x(n) = 1.79x(n-1) - 1.85x(n-2) + 1.27x(n-3) - 0.41x(n-4) + \nu(n),$$

where $\nu(n)$ is a white Gaussian random signal with zero mean and variance $\sigma_\nu^2 = 0.14817$.
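The autoregressive input can be generated with a few lines of code. The sketch below is illustrative and is not the original simulation code: the function name `ar4_input`, the burn-in length used to discard the start-up transient and the seed are assumptions. It also computes the poles of the model, whose magnitudes must be below one for the recursion to be stable.

```python
import numpy as np

# Coefficients of x(n-1), x(n-2), x(n-3), x(n-4) in the AR(4) model.
AR4 = np.array([1.79, -1.85, 1.27, -0.41])

def ar4_input(num_samples, noise_var=0.14817, seed=0, burn_in=1000):
    """Generate the AR(4) input driven by white Gaussian noise nu(n)."""
    rng = np.random.default_rng(seed)
    nu = np.sqrt(noise_var) * rng.standard_normal(num_samples + burn_in)
    x = np.zeros(num_samples + burn_in)
    for n in range(4, len(x)):
        # x[n-4:n][::-1] is [x(n-1), x(n-2), x(n-3), x(n-4)]
        x[n] = AR4 @ x[n - 4:n][::-1] + nu[n]
    return x[burn_in:]  # drop the start-up transient (assumption)

# Poles are the roots of z^4 - 1.79 z^3 + 1.85 z^2 - 1.27 z + 0.41.
poles = np.roots(np.concatenate(([1.0], -AR4)))
x = ar4_input(10_000)
```

The poles come in two complex conjugate pairs inside the unit circle, so the process is stationary but strongly colored.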

The eigenvalue spread ratio of the autocorrelation matrix for this highly correlated input sequence was found to be 944.67. The signal to noise ratio at the output of the unknown system was 50 dB, and all the simulation results were obtained by averaging 100 independent runs of the algorithms. The parameters of all the compared algorithms are given in Table 3.1, and they were chosen such that the steady-state misadjustments are comparable (around 0.04). The selection of the parameters was done following the guidelines from the corresponding papers and the levels of the misadjustments, obtained experimentally, are shown in Table 3.2.

The learning curves (the output excess MSE) for the compared algorithms are shown 
in Fig. 3.7 for VSSLMS, CP-VSLMS and NCVSLMS, in Fig. 3.8 for LMS, RVSLMS and 
CP-LMS, in Fig. 3.9 for TDLMS, DCT-LMS and TDCPVSLMS and in Fig. 3.10 for 
TDVSLMS, TDCPVSLMS and TDNCVSLMS. 

As expected, the time domain implementations do not perform well for such a highly correlated input signal. Indeed, if we compare the results shown in Fig. 3.7 and Fig. 3.8 with those shown in Chapter 2, we can see that the convergence speed of the time domain adaptive algorithms is much lower for a highly correlated input sequence. We note that the learning curves shown in Chapter 2 represent the mean squared coefficient error and not the excess MSE of the algorithms. The comparison between those figures and the ones presented here can still be made if we take into account that the simulations in Chapter 2 were performed with a random zero mean Gaussian-distributed input sequence, and in that case the excess MSE and the mean squared coefficient error are proportional¹³.


13 Also the lengths of the filters in Chapter 2 were smaller which means that the speed of convergence 
would be larger. However the difference between the convergence speed of the time domain implementa- 


tions in the framework of Chapter 2 and in the framework of this chapter are much larger. 
Table 3.1: The parameters of the compared algorithms (LMS, VSSLMS, RVSLMS, CP-LMS, CPVSLMS, NCVSLMS, TDLMS, DCT-LMS, TDVSLMS, TDCPVSLMS, TDNCVSLMS). Exponents shown as "?" are not legible in the source.

Algorithm | Parameters
LMS       | μ = 5·10^-3
VSSLMS    | γ = 1, μ_min = 5·10^-3, α = 0.97, μ_max = 3·10^-2
RVSLMS    | μ_max = 3·10^-2, α = 0.97, μ_min = 5·10^-3, β = 0.99, γ = 1
CP-LMS    | μ_1 = 3·10^-2, L = 1, μ_2 = 5·10^-3, T = 100
CPVSLMS   | μ_1 = 3·10^-2, L = 1, μ_min = 5·10^-3, T = 100, α = 0.6
NCVSLMS   | μ_min = 5·10^-3, α = 0.6, μ_max = 3·10^-2, T = 100
TDLMS     | μ = 5·10^-3, β = 0.9, ε = 2.5·10^-?
DCT-LMS   | M = 10, γ = 2·10^-?, β = 0.9985, ε = 8·10^-4
TDVSLMS   | μ_max = 5·10^-2, α = 0.9, μ_min = 5·10^-?, β = 0.9, ε = 2.5·10^-?, L = 10, γ = 10^?
TDCPVSLMS | μ_min = 5·10^-3, M = 3, μ_max = 5·10^-2, L = 10, ε = 2.5·10^-2, α = 0.6, β = 0.9
TDNCVSLMS | μ_min = 5·10^-3, L = 10, μ_max = 5·10^-2, α = 0.6, ε = 2.5·10^-?

Table 3.2: The misadjustments of the compared algorithms.

  | LMS   | VSSLMS | RVSLMS | CP-LMS | CPVSLMS | NCVSLMS
M | 4.18% | 3.98%  | 4.35%  | 4.20%  | 4.35%   | 4.23%

  | TDLMS | DCT-LMS | TDVSLMS | TDCPVSLMS | TDNCVSLMS
M | 3.91% | 3.46%   | 4.64%   | 3.90%     | 3.96%


In Fig. 3.9 and Fig. 3.10, the learning curves of the transform domain implementations are presented. One can see from these plots that, by using the output error to adjust the global component of the step-size, the convergence speed of the transform domain implementations can be significantly increased.

Of course, other techniques which can improve the convergence may be included in addition to the step-size adaptation. One method is the selection of the orthogonal transform, which must be chosen based on the properties of the input sequence. Another way to improve the convergence speed is to implement other expressions for power estimation instead of (3.8). Also, when (3.8) is implemented, the initialization of the power estimates is important. In our simulations we have observed that better performance is obtained when the power estimates are initialized with values close to the adaptive filter length (footnote 14). All these techniques are well described in the referenced publications, therefore they are not detailed here.

14 Actually, the power estimates must be initialized close to the real powers of the transform coefficients for faster convergence. Since in our simulations we have used the DCT and the power of the input sequence was unity, this initialization seems to be a good choice.

Figure 3.7: Comparison between VSSLMS, CP-VSLMS, and NCVSLMS.

Figure 3.8: Comparison between LMS, RVSLMS and CP-LMS.

Figure 3.9: Comparison of convergence curves for TDLMS, DCT-LMS and TDCPVSLMS.

Figure 3.10: Comparison of convergence curves for TDVSLMS, TDCPVSLMS, and TDNCVSLMS.
Algorithm  | TDLMS  | DCT-LMS                | TDCPVSLMS | TDNCVSLMS | TDVSLMS
Mult./Div. | 6N + 1 | 5N + MN = 15N (M = 10) | 12N + 5   | 6N + 8    | 6N + 4
Add./Sub.  | 3N     | 3N + MN = 13N (M = 10) | 6N + 3    | 3N + 3    | 3N + 2

Table 3.3: Computational complexity of the TDLMS, DCT-LMS, TDCPVSLMS, TDNCVSLMS and TDVSLMS algorithms.


3.2.6 Comparison of the transform domain variable step-size LMS algorithms


Here, we compare the transform domain implementations, described above, in terms of 
their computational complexity, memory load and simplicity of the parameter setup. To 
this end, in Table 3.3 the memory load and computational complexity of the TDLMS, 
TDCPVSLMS, TDNCVSLMS, DCT-LMS and TDVSLMS are shown. 

As expected, among all the algorithms the TDCPVSLMS has a large complexity, due to the use of two adaptive filters that work in parallel. Despite this fact, it converges fast and the setup of its parameters is very simple: a single parameter, $\mu_{\min}$, must be chosen in order to obtain a desired level of misadjustment, while the other parameters influence only the convergence speed. We note that the expression used to increase the value of the step-size is probably not the best choice; other expressions that provide better results may be used instead. We emphasize that the complexity of the DCT-LMS is also large, due to the calculation of the power estimates. Moreover, the complexity of the DCT-LMS depends on the parameter M.

In the case of TDNCVSLMS, the noise variance $\sigma_v^2$ is needed in the formula for the step-size update (footnote 15). Therefore it can be implemented in applications where $\sigma_v^2$ is known or can be estimated. Actually, the same discussion is valid also for the TDVSLMS and DCT-LMS algorithms if we take into account the analytical expressions used to estimate their misadjustments (see (3.61) and (3.42), respectively). These two equations are used to set up the parameters of both algorithms, and both include the value of the minimum MSE. In system identification applications the minimum MSE equals the noise variance; therefore these two algorithms also require some information about $\sigma_v^2$. For the time domain implementations VSSLMS and RVSLMS, the same discussion is valid for their parameter setup. The TDCPVSLMS and CP-VSLMS do not have this problem, since their misadjustments depend only on the minimum bounds of their step-sizes.

15 As in the case of its time domain counterpart.

Finally, the advantages and disadvantages of the transform domain implementations are summarized in Table 3.4.

           | TDLMS   | DCT-LMS     | TDCPVSLMS | TDNCVSLMS   | TDVSLMS
Complexity | small   | small       | large     | small       | small
Speed      | slowest | fast        | fast      | fast        | fast
Setup      | simple  | needs J_min | simple    | needs J_min | needs J_min

Table 3.4: Advantages and disadvantages of the transform domain algorithms.


3.3 Transform domain LMS algorithm with optimum step-size


In Section 3.1.2, we have given a brief theoretical analysis of the TDLMS for tracking time-varying channels. Using some common assumptions, we have obtained a simplified formula (3.35) which describes the steady-state MSE of the algorithm. From (3.35), it follows that the dependence between the steady-state MSE and the step-size is nonlinear and that the steady-state MSE has three components. The first component, $\sigma_v^2$, is the minimum level of the MSE, achieved in the case of perfect adaptation. The second component, $\frac{1}{2}\sigma_v^2 \mu N$, is proportional to the step-size $\mu$ and is due to the imperfect adaptation of the coefficients. Finally, the third component, $\frac{1}{2}\mu^{-1}\mathrm{tr}[\mathbf{R}_p\mathbf{Q}_p]$, is due to the time variations of the channel coefficients and is inversely proportional to the step-size.

The primary goal of the algorithm introduced here is to adaptively adjust the step-size $\mu$ toward the optimum $\mu_{opt}^{mse}$, which minimizes the steady-state MSE. We emphasize that the steady-state MSE and the steady-state mean squared coefficient error are minimized by different step-sizes, which are expressed in (3.36). In order to compute $\mu_{opt}^{mse}$, one needs to know the trace of the matrix $\mathbf{R}_p\mathbf{Q}_p$ and the variance of the output noise $\sigma_v^2$. When these two values are available, the computation of $\mu_{opt}^{mse}$ is trivial. Here we assume that this information is not available, and we propose an iterative method for computing the optimum value of the step-size $\mu_{opt}^{mse}$.
For simplicity of exposition we introduce the following notation regarding (3.35):

$$A = \frac{\sigma_v^2}{2}, \qquad B = \frac{1}{2}\,\mathrm{tr}\left[\mathbf{R}_p\mathbf{Q}_p\right] \qquad (3.73)$$

With this notation, equation (3.35) can be written as $J_{st} = 2A + \mu N A + B/\mu$. If we consider two adaptive TDLMS filters with equal lengths N and different step-sizes $\mu_1$ and $\mu_2$, their steady-state mean squared errors can be approximated by the following expressions:

$$J_{st_1} = 2A + \mu_1 N A + \frac{B}{\mu_1} \quad \text{and} \quad J_{st_2} = 2A + \mu_2 N A + \frac{B}{\mu_2}. \qquad (3.74)$$

In (3.74), the length N and the step-sizes $\mu_1$ and $\mu_2$ of the adaptive filters are known. Also, estimates of $J_{st_1}$ and $J_{st_2}$ can be obtained, for instance by averaging the output squared errors. As a consequence, the system of equations in (3.74) can be easily solved for the unknowns A and B. The solution is given by the following expression:

$$A = \frac{J_{st_1}\mu_1 - J_{st_2}\mu_2}{(\mu_1 - \mu_2)\left[2 + N(\mu_1 + \mu_2)\right]}, \qquad B = \mu_1\left(J_{st_1} - 2A - \mu_1 N A\right) \qquad (3.75)$$

The optimum step-size, which minimizes the steady-state MSE, can be computed by the following formula:

$$\mu_{opt}^{mse} = \sqrt{\frac{B}{NA}} \qquad (3.76)$$
The proposed approach for step-size adaptation introduced in the sequel is based on 
this concept, which uses two adaptive filters with equal lengths and different step-sizes. 
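As a numerical check of the two-filter idea, the following Python sketch solves the pair of equations in (3.74) for A and B and then evaluates the optimum step-size. It assumes the steady-state model J(mu) = 2A + mu*N*A + B/mu stated above; the function names and the numerical values used in the usage note are hypothetical and chosen only for illustration.

```python
import math

def solve_A_B(mu1, J1, mu2, J2, N):
    """Solve J_k = 2A + mu_k*N*A + B/mu_k (k = 1, 2) for the unknowns A and B."""
    A = (J1 * mu1 - J2 * mu2) / ((mu1 - mu2) * (2.0 + N * (mu1 + mu2)))
    B = mu1 * (J1 - 2.0 * A - mu1 * N * A)
    return A, B

def steady_state_mse(mu, A, B, N):
    """Steady-state MSE model J(mu) = 2A + mu*N*A + B/mu."""
    return 2.0 * A + mu * N * A + B / mu

def optimum_step_size(A, B, N):
    """Step-size that minimizes J(mu): setting dJ/dmu = N*A - B/mu^2 to zero."""
    return math.sqrt(B / (N * A))
```

For example, generating `J1` and `J2` with `steady_state_mse` from known values of A and B and passing them to `solve_A_B` should return those same values up to rounding, and `optimum_step_size` then gives a step-size whose modelled MSE is no larger than that at either test step-size.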


3.3.1 The proposed implementation 


Based on the above derivations, we propose the following TDLMS with adaptive step-size, whose block diagram is depicted in Fig. 3.11. The new algorithm contains two adaptive TDLMS filters with equal lengths N that operate in parallel. The first adaptive filter, with coefficients $\mathbf{h}_1(n)$, has a fixed step-size $\mu_1$, while the second adaptive filter, $\mathbf{h}_2(n)$, has a variable step-size $\mu_2(n)$ which is adapted using the following formula:

$$\mu_2(n+1) = \begin{cases} \sqrt{\dfrac{B}{NA}}, & \text{if } n = kL,\ k = 1, 2, \ldots \\ \mu_2(n), & \text{otherwise} \end{cases} \qquad (3.77)$$

Figure 3.11: The block diagram of the proposed transform domain adaptation of the step-size.


The parameters A and B are computed using (3.75). We note that A and B have constant values because the input x(n) is stationary and the covariance matrix $\mathbf{Q}_p$ has constant elements.

As we can see from (3.77), the behavior of the proposed algorithm can be described as follows: for L consecutive iterations, called the test interval, the step-sizes of both adaptive TDLMS filters are constant. At the end of the test interval, the output MSEs of both adaptive filters ($J_{st_1}$ and $J_{st_2}$, respectively) are computed and the step-size $\mu_2(n)$ is updated according to (3.77). To compute the output MSE of the two adaptive filters we propose the following analytical expression (footnote 16):

$$J_1(n) = \delta J_1(n-1) + (1-\delta)\frac{1}{L}\sum_{i=n-L+1}^{n} e_1^2(i), \qquad J_2(n) = \delta J_2(n-1) + (1-\delta)\frac{1}{L}\sum_{i=n-L+1}^{n} e_2^2(i) \qquad (3.78)$$

with $0 < \delta < 1$ being a constant parameter.

We should emphasize that (3.78) gives the values of the output MSE at time instant n, whereas (3.75) requires the steady-state MSE values. Therefore, at the beginning of the adaptation, the value of the step-size $\mu_2(n)$ is far from the optimum, due to the use of the transient mean squared error. For this reason we do not update the step-size $\mu_2(n)$ just once during the adaptation, but many times using (3.77). When both adaptive filters approach the steady-state, the estimates in (3.78) are close to the steady-state values of the MSE and the step-size $\mu_2(n)$ converges to the optimum.

16 Other expressions which give a better approximation of the output MSE can be used as well.

In all our experiments we have used a fixed value of the parameter L, which is the 
length of the test interval. Another possibility, which is not addressed here, is to use a 
time-varying test interval, for instance proportional with the time constant of the algo- 
rithm. 
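The step-size control described above (smoothed MSE estimates over a test interval of length L, followed by a re-computation of the step-size of the second filter) can be sketched as follows. This is a hedged illustration rather than the implementation used in the experiments: the class name is hypothetical, the MSE estimates are refreshed once per test interval, and the guards that skip the update when the two step-sizes coincide or when the estimated A or B is not positive are assumptions added here to keep the sketch numerically safe.

```python
import math

class StepSizeAdapter:
    """Tracks MSE estimates of two filters and re-tunes mu2 once per test interval."""

    def __init__(self, mu1, mu2_init, N, L, delta=0.9):
        self.mu1, self.mu2 = mu1, mu2_init
        self.N, self.L, self.delta = N, L, delta
        self.J1 = self.J2 = 0.0       # smoothed output-MSE estimates
        self.acc1 = self.acc2 = 0.0   # sums of squared errors in the current interval
        self.count = 0

    def update(self, e1, e2):
        """Feed one pair of output errors; return the step-size for the next iteration."""
        self.acc1 += e1 * e1
        self.acc2 += e2 * e2
        self.count += 1
        if self.count < self.L:
            return self.mu2           # inside the test interval: mu2 is kept constant
        d = self.delta
        self.J1 = d * self.J1 + (1.0 - d) * self.acc1 / self.L
        self.J2 = d * self.J2 + (1.0 - d) * self.acc2 / self.L
        self.acc1 = self.acc2 = 0.0
        self.count = 0
        dmu = self.mu1 - self.mu2
        if dmu != 0.0:                # assumed guard: the two step-sizes must differ
            A = (self.J1 * self.mu1 - self.J2 * self.mu2) / (
                dmu * (2.0 + self.N * (self.mu1 + self.mu2)))
            B = self.mu1 * (self.J1 - 2.0 * A - self.mu1 * self.N * A)
            if A > 0.0 and B > 0.0:   # assumed guard: skip unusable transient estimates
                self.mu2 = math.sqrt(B / (self.N * A))
        return self.mu2
```

In use, `update` would be called once per iteration with the output errors of the fixed step-size filter and of the variable step-size filter, and the returned value would be used as the step-size of the second filter in the next coefficient update.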

We have tested the proposed algorithm in the system identification framework depicted in Fig. 3.11. The output noise v(n) was a zero mean Gaussian sequence with variance $\sigma_v^2 = 25 \times 10^{-4}$. The input sequence x(n) was given by the model:

$$x(n) = \gamma\,x(n-1) + \theta(n)$$

where $\gamma = 0.9$ and $\theta(n)$ is a random Gaussian-distributed sequence with zero mean and variance chosen such that the variance of x(n) was unity.

The lengths of the unknown system and of the adaptive filters were equal to N = 10. The step-size of the first adaptive filter was chosen as $\mu_1 = 5 \times 10^{-?}$, while the step-size of the second adaptive filter was initialized with $\mu_2(0) = 10^{-?}$ (the exponents are not legible in the source).

To update the coefficients of both adaptive filters we have used (3.10), and the powers of the transform coefficients were estimated by (3.8). The parameter used to estimate the powers of the transform coefficients was $\alpha = 0.9$ and the coefficient used to estimate the MSE in (3.78) was $\delta = 0.9$. The length of the test interval in (3.77) was chosen as L = 50. The time-varying unknown system was modelled by (3.20), and the increments e(n) of the time-varying channel coefficients were random zero mean sequences with variance $10^{-?}$.

The plotted results are obtained by averaging 100 independent runs, each run containing $5 \times 10^{?}$ iterations.


3.3.2 Simulations and results


The behavior of the expected value of the step-size, $E[\mu_2(n)]$, during the adaptation is depicted in Fig. 3.12, together with the value of the optimum step-size, which was computed from (3.36). We can see from this figure that the step-size of the proposed algorithm converges close to the optimum $\mu_{opt}^{mse}$. The transient period in Fig. 3.12 is due to the transient periods of both adaptive filters.

The steady-state MSE at the output of the first adaptive filter, with fixed step-size $\mu_1$, was $J_{st_1} = -24.2760$ dB, while for the second adaptive filter, with adaptive step-size $\mu_2(n)$, the steady-state MSE was found to be $J_{st_2} = -25.0490$ dB. Clearly there is a reduction of the steady-state MSE when a step-size close to the optimum is used.

Figure 3.12: Step-size behavior during adaptation.


3.4 The Scrambled Least Mean Squared algorithm 


The LMS algorithm is widely used in adaptive filtering for several reasons. Its principal attractive characteristics are low computational complexity, clear convergence analysis in a stationary environment, unbiased convergence in the mean to the Wiener solution, stable behavior when implemented with finite-precision arithmetic, and a straightforward setup with a single parameter to be pre-defined [25], [41]. However, the convergence speed of the LMS algorithm depends on the eigenvalue spread of the input autocorrelation matrix R, as we have seen in the first chapter of this thesis. For highly correlated inputs, the eigenvalue spread of R is high, leading to slow convergence of the LMS. During the last decades there has been great interest in the adaptive filtering community in improving the convergence properties of the LMS without increasing the computational complexity too much. One important class of LMS-like adaptive algorithms is the class of variable step-size LMS algorithms, some members of which were also described in the first chapter of the thesis. The main idea of the VSSLMS algorithms is to use a time-variable step-size that has large values when the algorithm is far from the optimum, to increase the convergence speed, and smaller values near the steady-state, to obtain a small misadjustment. Although the VSSLMS algorithms have shown improved speed of convergence for uncorrelated input signals, their behavior for highly correlated signals is still poor, as we have illustrated in the simulations shown in Chapter 3. Another attempt to improve the convergence speed and the steady-state misadjustment of the LMS is cost function adaptation [71], [72], which is not the subject of this thesis.

An alternative way to increase the convergence speed of the LMS is to modify not only the coefficient update equation (by implementing a variable step-size or cost function adaptation) but also the statistics of the input signal, such that the input autocorrelation matrix is as close as possible to the identity matrix. When the input autocorrelation matrix R is the identity matrix (or close to it), all the convergence modes of the adaptive filter are equally excited and the convergence speed is improved. The algorithms that use an orthogonal transform to diagonalize the matrix R belong to the class of Transform Domain LMS adaptive algorithms, such as those presented earlier in this chapter.

In the communications community the technique of scrambled transmission is well known. When secure communication is needed, the transmitted signal is transformed in such a way that its information content is unintelligible to a third party; this transformation can be made by means of a scrambling device. The main classes of scrambling devices are the digital scrambler and the analog scrambler. The advantage of the digital scrambler is its higher degree of security compared with its analog counterpart; its main drawback is that the resulting waveform (the scrambled signal) occupies a much higher bandwidth than the baseband unscrambled signal.

While scrambling was initially introduced to secure the data transmission, in digital communication systems it also provides a source of pseudodata for the adjustment of timing and Automatic Gain Control (AGC). In addition, various subsystems in data communication systems, such as equalizers and echo cancelers, work better with uncorrelated input sequences. Scrambling is also a way of whitening a correlated signal, such that the convergence speed of adaptive filters operating with scrambled signals is improved [3]. The scrambled sequence must be unscrambled at the receiver in order to preserve overall bit sequence transparency.


From the above considerations one can say that the transformation of the input signal by means of an orthogonal transform and by means of a scrambling device represents two ways to improve the convergence speed of adaptive filters. The question is which technique is of most interest in practical implementations. In this section, we focus on the application of the Transform Domain LMS (TDLMS) and the Scrambled LMS (SCLMS) to the problem of transmission of digital data through a telephone channel. Although the Scrambled LMS does not use an orthogonal transformation at the input, it is based on the same principle of whitening the input sequence as the transform domain LMS; therefore we have included this discussion in the present chapter.

Figure 3.13: A full duplex speech communication channel.


3.4.1 Problem formulation and theoretical background 


In this section, we study the problem of full-duplex digital transmission over a telephone line, as depicted in Fig. 3.13. Ideally, all the energy of the signal transmitted from transmitter A has to be received by receiver B and vice versa. Since the hybrid terminations are not ideal, a small part of the transmitted data signal goes to the local receiver (through the local echo path) and disturbs the communication. This signal is called the local echo. Another source of echo is the signal reflected due to the imperfect impedance matching at the ends of the telephone line. Here we concentrate only on the local echo cancellation problem. The block diagram for echo cancellation at emitter A is depicted in Fig. 3.14.

The main problem in local echo cancellation is to approximate the transfer function of the local echo path $\mathbf{h}$ using an adaptive FIR filter $\hat{\mathbf{h}}(n)$ and then to subtract the estimated echo $\hat{y}(n)$ from the returning echo y(n), such that the residual echo is minimized.


When the LMS algorithm is used to modify the coefficients of the adaptive filter $\hat{\mathbf{h}}(n)$, the update equation is given by:

$$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu\,e(n)\,\mathbf{x}(n), \qquad (3.79)$$

where $\hat{\mathbf{h}}(n) = [\hat{h}_1(n), \hat{h}_2(n), \ldots, \hat{h}_N(n)]^t$ is the N x 1 vector of the adaptive filter coefficients, $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N+1)]^t$ is the N x 1 vector containing the past N samples of the input sequence x(n), $\mu$ is a constant step-size that controls the convergence speed and ensures the stability of the algorithm, and e(n) is the output error, which is computed as follows:

$$e(n) = y(n) + v(n) - \hat{y}(n), \qquad (3.80)$$

with $y(n) = \mathbf{h}^t\mathbf{x}(n)$, $\hat{y}(n) = \hat{\mathbf{h}}^t(n)\mathbf{x}(n)$ and v(n) being the output of the echo path (local echo), the output of the adaptive filter (estimated local echo) and the output noise, respectively. Usually, the sequence v(n) contains the transmitted signal from emitter B plus the reflected echo and the channel noise. For simplicity of exposition and without loss of generality, we consider here that the reflected signal and the transmitted signal from emitter B are zero. This corresponds to the situation when the transmission is done just from user A to user B and the hybrid connections at the receiver are perfect.

Figure 3.14: Adaptive local echo cancellation block diagram.

The convergence speed of the LMS algorithm is governed by the eigenvalues of the input autocorrelation matrix $\mathbf{R} = E\{\mathbf{x}(n)\mathbf{x}^t(n)\}$. If some of the eigenvalues of R are very small, then the convergence of the LMS in the direction of these eigenvalues will be slow, resulting in a slow overall convergence of the algorithm. A good measure of the convergence speed of the LMS is the eigenvalue spread $\chi(\mathbf{R})$ of the input autocorrelation matrix R, computed as the ratio between the largest and the smallest eigenvalue of R. For high values of $\chi(\mathbf{R})$, the LMS has slow convergence. If the input signal x(n) is uncorrelated, then the matrix R equals the identity matrix and has equal eigenvalues, resulting in $\chi(\mathbf{R}) = 1$. For correlated input sequences the eigenvalues of R are not equal and $\chi(\mathbf{R})$ might be much larger than unity. In order to speed up the convergence of the LMS when operating with correlated signals, the class of Transform Domain LMS algorithms was introduced.
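The effect of input correlation on the eigenvalue spread can be illustrated with a short Python sketch. It uses a unit-variance first-order autoregressive input, whose autocorrelation sequence is r(k) = rho^|k|, as an assumed example; the function names are hypothetical.

```python
import numpy as np

def eigenvalue_spread(R):
    """Ratio between the largest and the smallest eigenvalue of a symmetric matrix R."""
    eig = np.linalg.eigvalsh(R)
    return eig.max() / eig.min()

def ar1_autocorrelation(N, rho):
    """N x N autocorrelation matrix of a unit-variance AR(1) process: R[i, j] = rho^|i-j|."""
    idx = np.arange(N)
    return rho ** np.abs(idx[:, None] - idx[None, :])
```

For an uncorrelated unit-power input, `eigenvalue_spread(np.eye(N))` equals one, while `eigenvalue_spread(ar1_autocorrelation(10, 0.9))` is much larger than unity, which is the situation in which the time domain LMS converges slowly.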


3.4.2 Analysis of the LMS algorithm for digital data transmission


We study here the behavior of the LMS adaptive algorithm in the special case when the transmitted sequence is constant, with all its samples equal to +1. The input autocorrelation matrix for this special case can be written as follows:

$$\mathbf{R} = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ 1 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 1 \end{bmatrix} \qquad (3.81)$$

which does not admit an inverse.

The system of equations used to compute the Wiener solution can be written as follows:

$$\mathbf{R}\mathbf{h}_o = \mathbf{R}\mathbf{h}, \qquad (3.82)$$

where $\mathbf{h}_o$ is the vector of the optimum coefficients and $\mathbf{h}$ is the vector of the echo path. Since the autocorrelation matrix does not admit an inverse, it is not possible to pre-multiply (3.82) by $\mathbf{R}^{-1}$ to obtain the optimum coefficients. Moreover, taking into account the structure of the matrix R shown in (3.81), the system of N equations in (3.82) reduces to a single equation:

$$h_{o_1} + h_{o_2} + \cdots + h_{o_N} = h_1 + h_2 + \cdots + h_N \qquad (3.83)$$

Clearly, equation (3.83) has multiple solutions, which suggests that the LMS does not have a unique convergence point. If we take into account the update equation of the adaptive filter coefficients (3.79), we can see that all coefficients are updated by the same quantity, since the elements of the vector x(n) are all equal. It follows that the coefficients of the adaptive filter are all equal at every time instant, provided that they are initialized with the same value (footnote 17). Using this observation and (3.83) we can find the convergence point of the LMS algorithm as follows:

$$h_{o_1} = h_{o_2} = \cdots = h_{o_N} = \frac{1}{N}\sum_{i=1}^{N} h_i = \frac{1}{N}\left(h_1 + h_2 + \cdots + h_N\right) \qquad (3.84)$$

The steady-state MSE can be approximated by the well known expression:

$$J_{st} = J_{\min} + \frac{\mu}{2}\,J_{\min}\,\mathrm{tr}[\mathbf{R}] = J_{\min}\left(1 + \frac{\mu N}{2}\right) \qquad (3.85)$$
For this special case, the matrix R has N - 1 eigenvalues equal to zero and one eigenvalue equal to N, so the eigenvalue spread of R is infinite. This fact suggests that the algorithm might not converge. Actually, this is not true: for this case of equal inputs, the system is equivalent to one in which the echo path has just one coefficient, equal to the sum $H = \sum_{i=1}^{N} h_i$, and the adaptive filter also has just one coefficient, which converges to H. Due to this fact the convergence speed is expected to be N times faster when the inputs are equal than in the case of non-equal inputs.

The minimum MSE can be computed as follows:

$$J_{\min} = E\{e_o^2(n)\} = E\{(y(n) - \hat{y}_o(n) + v(n))^2\} = \sigma_v^2 + E\{(y(n) - \hat{y}_o(n))^2\} \qquad (3.86)$$

where $\hat{y}_o(n)$ is the optimum output obtained in the case of perfect adaptation, which can be written as:

$$\hat{y}_o(n) = \sum_{i=1}^{N} h_{o_i}\,x(n-i+1) = \sum_{i=1}^{N} h_{o_i} \qquad (3.87)$$

where in (3.87) we have taken into account that x(n) = 1 at each time instant n. Replacing (3.87) in (3.86) and using (3.84), the minimum MSE at the output of the adaptive filter equals $J_{\min} = \sigma_v^2$ and the steady-state MSE is expressed by the following analytical result:

$$J_{st} = \sigma_v^2\left(1 + \frac{\mu N}{2}\right) \qquad (3.88)$$

In conclusion, the convergence point of the LMS when operating with constant input sequences is different from the convergence point when the inputs are not equal. This situation is actually equivalent to the one in which the echo path and the adaptive filter have just one coefficient. However, the minimum and the steady-state MSE for equal and non-equal inputs are the same.
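The behavior derived in this subsection can be checked with a small Python sketch of the LMS update running on a constant +1 input. The function name, the echo path values in the usage note and the zero initialization are illustrative assumptions; the noise-free default is chosen only so that the convergence point is easy to inspect.

```python
import numpy as np

def lms_constant_input(h, mu, num_iter, noise_std=0.0, seed=0):
    """Run LMS with a constant +1 input; h is the (hypothetical) echo path."""
    rng = np.random.default_rng(seed)
    N = len(h)
    x = np.ones(N)              # regressor vector: all past samples equal +1
    h_hat = np.zeros(N)         # all coefficients start from the same value
    for _ in range(num_iter):
        d = h @ x + noise_std * rng.standard_normal()   # echo plus output noise
        e = d - h_hat @ x                               # output error
        h_hat = h_hat + mu * e * x                      # LMS coefficient update
    return h_hat
```

With, for example, `h = np.array([0.5, -0.2, 0.1, 0.4])` and `mu = 0.05`, all coefficients of the returned vector stay equal to each other and approach the mean of the echo path coefficients, not the echo path itself. Also, `np.linalg.eigvalsh(np.ones((N, N)))` returns N - 1 eigenvalues that are numerically zero and one eigenvalue equal to N.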


17 If the coefficients are initialized with different values, the convergence point is different.

Figure 3.15: The block diagram of the adaptive echo cancellation in transform domain.


3.4.3 Analysis of the Transform Domain LMS algorithm for digital data transmission


To implement the Transform Domain LMS, the block diagram from Fig. 3.14 is modified as shown in Fig. 3.15, by the introduction of the block denoted T, which represents the orthogonal transformation applied to the input signal x(n). The coefficient update formula becomes:

$$\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n) + \mu\,\boldsymbol{\Gamma}^{-1}(n)\,e(n)\,\mathbf{s}(n), \qquad (3.89)$$

where $\mathbf{s}(n) = \mathbf{T}\mathbf{x}(n)$ is the vector of the transform coefficients, $\mathbf{T}$ represents the matrix of the orthogonal transformation, $\boldsymbol{\Gamma}(n)$ is a diagonal matrix having on the main diagonal the power estimates of the transform coefficients, and e(n) is the output error, computed as follows:

$$e(n) = y(n) + v(n) - \hat{y}(n), \qquad (3.90)$$

and $\hat{y}(n) = \hat{\mathbf{h}}^t(n)\mathbf{s}(n)$.
The equation which gives the optimum coefficients of the adaptive filter in the transform domain was derived in the previous sections and was found to be:

$$\mathbf{R}_s\mathbf{h}_{T_o} = \mathbf{R}_s\mathbf{T}\mathbf{h}. \qquad (3.91)$$

Usually, equation (3.91) is pre-multiplied by $\mathbf{R}_s^{-1}$ and the optimum solution is $\mathbf{h}_{T_o} = \mathbf{T}\mathbf{h}$. In the special case when the digital data transmitted through the communication channel is a long string of constant values (for instance when the transmitter sends just +1s), the matrix $\mathbf{R}_s$ has a special structure and does not admit an inverse. If the input sequence x(n) consists of a long string of +1s, then the vector x(n) has all its elements equal to +1 and the transformed vector s(n) has its first element non-zero and the other elements equal to zero. Therefore, in this special case the structure of the matrix $\mathbf{R}_s$ is expressed by:

$$\mathbf{R}_s = \begin{bmatrix} r_{11} & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{bmatrix} \qquad (3.92)$$

Using (3.92) in (3.91) we can conclude that the first element of the vector $\mathbf{h}_{T_o}$ is equal to the first element of the vector $\mathbf{T}\mathbf{h}$. Moreover, from the update equation (3.89) we can see that, since just the first element of the vector s(n) is non-zero, just the first element of $\hat{\mathbf{h}}(n)$ is adapted and the others remain unchanged. If the vector $\hat{\mathbf{h}}(n)$ is initialized with zeros, then the optimum vector will be:

$$\mathbf{h}_{T_o} = \left[(\mathbf{T}\mathbf{h})_1, 0, \ldots, 0\right]^t, \qquad (3.93)$$

where by $(\mathbf{T}\mathbf{h})_1$ we denote the first element of the vector $\mathbf{T}\mathbf{h}$.
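The structure of the transformed vector for a constant input can be verified numerically. The sketch below builds an orthonormal DCT-II matrix, which is an assumption about the normalization of the transform (the text only states that the DCT is used), and applies it to a vector of ones; the function name is hypothetical.

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal DCT-II matrix T (rows are the basis vectors), so that T @ T.T = I."""
    n = np.arange(N)
    k = n[:, None]
    T = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * k / (2 * N))
    T[0, :] = 1.0 / np.sqrt(N)   # the first basis vector is constant
    return T
```

With this normalization, `dct_matrix(N) @ np.ones(N)` has its first element equal to the square root of N and all other elements numerically zero, so only the first coefficient of the adaptive filter is excited. Likewise, the first element of `dct_matrix(N) @ h` equals the sum of the elements of `h` divided by the square root of N.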

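The steady-state MSE can be obtained following a procedure similar to the one in Section 3.2 of this thesis. Finally, the following analytical expression gives the value of the steady-state MSE:

$$J_{st_{TD}} = J_{\min} + \mathrm{tr}\left[\mathbf{R}_s\mathbf{C}(\infty)\right] \qquad (3.94)$$

where $\mathbf{C}(\infty) = \lim_{n\to\infty} E\{\Delta\hat{\mathbf{h}}(n)\,\Delta\hat{\mathbf{h}}^t(n)\}$, $\Delta\hat{\mathbf{h}}(n) = \hat{\mathbf{h}}(n) - \mathbf{h}_{T_o}$, $\mathbf{R}_s$ is given in (3.92) and $J_{\min}$ is the minimum MSE, obtained in the case of perfect adaptation when the coefficients of the adaptive filter equal $\mathbf{h}_{T_o}$.

Taking into account (3.92), we can write (3.94) in the following form:

$$J_{st_{TD}} = J_{\min} + r_{11}\,c_{11}(\infty), \qquad (3.95)$$

where $c_{11}(n) = E\{[\hat{h}_1(n) - h_{o_1}]^2\}$ is the first diagonal element of $\mathbf{C}(n)$ and $h_{o_1}$ is the first element of $\mathbf{h}_{T_o}$.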
From the above equations, the analytical expression for the steady-state MSE can be obtained after some simple mathematical manipulations, as follows:

$$J_{st_{TD}} = J_{\min}\left(1 + \frac{\mu}{2}\right) \qquad (3.96)$$

From (3.96) we can see that when the input sequence is constant, the steady-state MSE does not depend on the length N of the adaptive filter. This is expected, since in the adaptation process just one coefficient of the adaptive filter is modified. Actually, this situation is similar to the case in which the adaptive filter has just one coefficient.

The minimum MSE $J_{\min}$ can be obtained in the following way:

$$J_{\min} = E\{e_o^2(n)\} = E\{[y(n) - \hat{y}_o(n) + v(n)]^2\} \qquad (3.97)$$

where $y(n) = \sum_{i=1}^{N} x(n-i+1)\,h_i$ and the optimum output is given by $\hat{y}_o(n) = \sum_{i=1}^{N} s_i(n)\,h_{o_i}$, with $s_i(n)$ being the i-th element of the transformed input vector s(n) and $h_{o_i}$ the i-th element of the optimum vector $\mathbf{h}_{T_o}$.

For a constant input sequence (say x(n) = 1), just the first element of s(n) is non-zero and the others are equal to zero. In this case equation (3.97) simplifies to:

$$J_{\min} = E\{[y(n) - s_1(n)\,h_{o_1} + v(n)]^2\} \qquad (3.98)$$

When the DCT is used for the decorrelation of the sequence x(n), the input vector s(n) and the optimum vector $\mathbf{h}_{T_o}$ can be written in the following manner:

$$\mathbf{s}(n) = \left[\sqrt{N}, 0, \ldots, 0\right]^t \quad \text{and} \quad \mathbf{h}_{T_o} = \left[\frac{1}{\sqrt{N}}\sum_{i=1}^{N} h_i, 0, \ldots, 0\right]^t \qquad (3.99)$$

Using (3.99) in (3.97), the minimum MSE is expressed by:

$$J_{\min} = E\left\{\left[\sum_{i=1}^{N} h_i - \sqrt{N}\,\frac{1}{\sqrt{N}}\sum_{i=1}^{N} h_i + v(n)\right]^2\right\} = E\{v^2(n)\} = \sigma_v^2 \qquad (3.100)$$

Finally, the steady-state MSE is obtained by combining (3.96) and (3.100), and it is expressed by:

$$J_{st_{TD}} = \sigma_v^2\left(1 + \frac{\mu}{2}\right) \qquad (3.101)$$

It follows from the above analytical result that the transform domain LMS converges 
to an optimum solution, which is not equal to the coefficients of the echo path when the 
inputs are all equal. Moreover, the steady-state excess MSE is N times smaller than the 
steady-state excess MSE of the time domain LMS which uses the same step-size. 
Figure 3.16: A full duplex scrambled communication channel.


3.4.4 The Scrambled LMS 


A block diagram of a full duplex scrambled transmission over a telephone line is depicted 
in Fig. 3.16. We can see that in the case of a scrambled transmission the blocks denoted 
by Scrambler and Descrambler are introduced. Their main goal is to secure the data 
transmission. When a message has to be sent from the user A to the user B, the signal 
is scrambled prior to the transmission. The user B receives the scrambled signal and 
using a corresponding descrambler, decodes the transmitted data. Here, we focus on the 
application of the digital scrambler/descrambler. 

From the adaptive filtering point of view, an important fact is that the scrambler acts to "whiten" the data to be sent. For modem components such as the echo canceler and the equalizer to function properly, it is commonly assumed that the data is random and i.i.d. (independent and identically distributed). This assumption can easily be violated, since long sequences of equal samples can be sent. The scrambler tries to enforce this assumption by making the bit sequence look random, so that the input symbol data $x_{sc}(n)$ are uncorrelated.

The block diagram of the local echo cancellation in the case of scrambled transmission
is depicted in Fig. 3.17. The difference between Fig. 3.17 and Fig. 3.14 is that the
input sequence x(n) is transformed by the block denoted Scrambler and the resulting
sequence x_sc(n) is transmitted over the telephone line. This is why the scrambler device
also appears at the input of the echo path in Fig. 3.17. We note that the orthogonal
transformation T appears in Fig. 3.15 just at the input of the adaptive filter.

When the LMS algorithm is used to update the coefficients of the adaptive filter, the
update equation is exactly the same as (3.79), with the only difference that the input signal
into the adaptive filter and into the echo path is now the scrambled sequence x_sc(n).
It is easy to show that the optimum solution when the Scrambled LMS algorithm is
used to update the coefficients of the adaptive filter is h_{o,sc} = h and the steady-state MSE
can be expressed by¹⁹:

$$ J_{st,sc} = \sigma_v^2 + \frac{1}{2} \mu \, \mathrm{tr}[R_{sc}] \, \sigma_v^2 \qquad (3.102) $$

where R_sc is the autocorrelation matrix of the scrambled sequence x_sc(n).

¹⁸ For the case of a constant input sequence x(n) = 1.

Figure 3.17: Scrambled adaptive local echo cancellation.

Actually, when the Scrambled LMS is implemented, the scrambler does not only permute
the input samples but also inserts some −1's between the samples of the input sequence
x(n) to obtain the scrambled sequence x_sc(n). In this way, the autocorrelation matrix R_sc
has diagonal elements equal to 1 and very small off-diagonal elements. Due to this
fact, the analysis of the Scrambled LMS can be made in a similar manner as in Chapter 2.


3.4.5 Comparison between scrambled LMS, transform domain LMS and time domain LMS for echo cancellation


In system identification applications, such as echo cancellation, the common way
to measure the performance of an adaptive filter is the Normalized Estimation Error
(NEE), defined as (see [71], [72]):

$$ NEE(n) = 10 \log_{10} \frac{\| h - \hat{h}(n) \|^2}{\| h \|^2} \quad \text{for LMS and SCLMS} \qquad (3.103) $$

$$ NEE(n) = 10 \log_{10} \frac{\| Th - \hat{h}(n) \|^2}{\| Th \|^2} \quad \text{for TDLMS} \qquad (3.104) $$

¹⁹ The derivation of the steady-state mean squared error can be made as in Chapter 2 using the
assumption that x(n) is i.i.d.


At the steady-state, taking into account that ĥ(∞) has null elements except the first one,
equation (3.104), for the transform domain LMS, becomes:

$$ NEE_{td}(\infty) = 10 \log_{10} \frac{\sum_{i=2}^{N} (Th)_i^2}{\sum_{i=1}^{N} (Th)_i^2} \qquad (3.105) $$

It follows from (3.105) that, even in the case of a perfect adaptation when ĥ_1(∞) =
(Th)_1, the steady-state level of the NEE cannot be made very small due to the terms (Th)_i,
i = 2, ..., N, which are not cancelled by the corresponding adaptive filter coefficients. On
the other hand, the steady-state MSE in (3.101) can be reduced to a desired level by
decreasing the value of the step-size μ.
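The floor on the NEE can be illustrated numerically. The sketch below assumes an orthonormal DCT-II for T and a hypothetical random echo path; `nee_db` and `dct_matrix` are illustrative helper names, not functions from the thesis. It evaluates the NEE of a transform domain filter whose first coefficient is perfectly adapted while all the others stay at zero.

```python
import numpy as np

def nee_db(target, estimate):
    """Normalized estimation error in dB: 10 log10(||target - estimate||^2 / ||target||^2)."""
    target = np.asarray(target, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return 10.0 * np.log10(np.sum((target - estimate) ** 2) / np.sum(target ** 2))

def dct_matrix(n):
    """Orthonormal DCT-II matrix (assumed transform T)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    t = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    t[0, :] = np.sqrt(1.0 / n)
    return t

N = 9
rng = np.random.default_rng(1)
h = rng.standard_normal(N)                 # hypothetical echo path
Th = dct_matrix(N) @ h

h_td = np.zeros(N)
h_td[0] = Th[0]                            # perfect adaptation of the first coefficient only

floor_db = nee_db(Th, h_td)                # NEE reached by the TDLMS for constant input
floor_formula = 10.0 * np.log10(np.sum(Th[1:] ** 2) / np.sum(Th ** 2))
```

The value `floor_db` depends only on the echo path, not on the step-size, which is why decreasing μ cannot lower it.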

A similar discussion can be made for the LMS algorithm, since the coefficients of the
adaptive filter do not converge to the coefficients of the echo path but all converge to
the same value, the average of the h_i.²⁰ As a consequence, the steady-state NEE cannot be
made very small by decreasing the step-size μ. The steady-state MSE, on the other hand, is
proportional to the step-size and the length of the adaptive filter, as seen in (3.88).

Due to the fact that the optimum coefficients of the Scrambled LMS equal the coefficients
of the echo path, both the steady-state NEE and the steady-state MSE decrease when
the step-size is decreased.

Since in the case of echo cancellation the interest is to minimize the output error
between the echo y(n) and its estimate ŷ(n), the MSE is the best choice to measure the
performance of the TDLMS algorithm. Actually, in echo cancellation applications the
interest is not to identify the echo path but to reduce the local echo. As a consequence,
the best way to compare several algorithms (for instance LMS, transform domain LMS
and scrambled LMS) is to compare their output MSEs.


²⁰ This is true when the coefficients of the LMS are initialized with zeros.

3.4.6 Simulations and results


Here we study the performance, in terms of MSE and mean squared coefficient error,
of the time domain LMS, the scrambled LMS and the transform domain LMS in the echo
cancellation framework for digital data transmission. The length of the echo path and of
the adaptive filters was N = 9 in all experiments. In order to illustrate the analytical
results from the previous section, two different experiments were performed. In the first
experiment, we studied an extreme case where the input sequence was a constant
signal, whereas in the second experiment we tested the three algorithms when the input
sequence was a binary sequence with elements from the set {−1, +1}. The samples of the
input sequence in the second experiment were, however, correlated in the sense that long
strings of −1's and +1's were included in x(n). Although a constant input sequence as in the
first experiment is unlikely to appear in practice, we have chosen to include this
example in order to get a clearer insight into the adaptation mechanism of these
three algorithms.

First experiment: The block diagrams used to implement the LMS, TDLMS and
SCLMS algorithms are those depicted in Fig. 3.14, Fig. 3.15 and Fig. 3.17, respectively.
The input sequence x(n) has a constant level of +1. The step-sizes used in the LMS
and SCLMS were made equal, whereas for the TDLMS we have used a step-size that is 9
times larger²¹.

The values of the first coefficient of the TDLMS and of the first element of Th during the
adaptation are shown in Fig. 3.18. We can see that ĥ_1(n) converges to the first element of
the vector Th and the other coefficients of the TDLMS remain zero, which is in agreement
with the theoretical results.

The value of the first coefficient during the adaptation together with the average of 
the coefficients of the echo path are shown in Fig. 3.19 for the LMS algorithm. The 
plotted learning curves show a good agreement with the analytical result of (3.84) (the 
other coefficients of the adaptive filter were equal to the first coefficient). 
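The behavior of the time domain LMS in this experiment is easy to reproduce. The following is a minimal noise-free sketch, assuming a hypothetical random echo path and a hypothetical step-size `mu` (chosen so that mu * N < 2); it is not the simulation setup of the thesis.

```python
import numpy as np

N = 9
mu = 0.01                        # hypothetical step-size, mu * N < 2
rng = np.random.default_rng(2)
h = rng.standard_normal(N)       # hypothetical echo path
w = np.zeros(N)                  # adaptive coefficients, initialized with zeros
x = np.ones(N)                   # constant input: the regressor is always all ones

for _ in range(2000):
    d = h @ x                    # noise-free echo sample
    e = d - w @ x                # output error
    w = w + mu * e * x           # LMS update; every coefficient gets the same increment
```

Since all the entries of the regressor are equal, all the coefficients stay equal to each other and settle at the average of the echo path coefficients, not at the echo path itself.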

In Fig. 3.20 and Fig. 3.21 the second and the fifth coefficients of the adaptive filter are
plotted for the scrambled LMS together with the corresponding coefficients of the echo
path model. These figures also show a good agreement with the theory (we see
that the coefficients of the SCLMS approximate the coefficients of the echo path h).

In Fig. 3.22, the Normalized Estimation Error (see (3.103) and (3.104)) is plotted
for the TDLMS and SCLMS. We can see from this figure that the steady-state level of
the NEE in the case of TDLMS is higher than in the case of SCLMS, which is due to
the fact that some coefficients of the adaptive filter are not updated. The reason for this
phenomenon is that the transformed input vector s(n) has just one non-zero element.

Finally, the excess MSE for the LMS, SCLMS and TDLMS are shown in Fig. 3.23,
Fig. 3.24 and Fig. 3.25, respectively. We can see that the TDLMS performs better than
the SCLMS in terms of MSE for this particular input sequence. The step-size of the
TDLMS was 9 times larger than the step-sizes of the LMS and SCLMS, as suggested by
(3.96) and (3.102), in order to obtain the same level of the steady-state misadjustment.
We can see that both filters converge to the same level of the MSE although they have
different step-sizes, which confirms the validity of (3.96).

²¹ This was suggested by the expression of the misadjustment in (3.96).

Figure 3.18: The behavior of the first co-

Figure 3.19: The behavior of the first coefficient of the LMS and the average of the

Figure 3.20: The behavior of the second coefficient of the SCLMS and the second
coefficient of the echo path model in the first experiment.

Figure 3.21: The behavior of the fifth coefficient of the SCLMS and the fifth coefficient
of the echo path model in the first experiment.

Figure 3.22: Normalized estimation error

Figure 3.24: The excess MSE for the SCLMS in the first experiment.

Figure 3.25: The excess MSE of the TDLMS for the first experiment.

Second experiment: The same block diagrams were used to implement the three
compared algorithms. The input sequence was bipolar with elements in {−1, +1} and it
contained long strings of consecutive −1's and +1's. In this case, the input autocorrelation
matrix is not diagonal and the eigenvalue spread is larger than unity.

Figure 3.26: The first coefficient of the

Figure 3.27: The third coefficient of the TDLMS and its optimum value during the
adaptation (second experiment).

Figure 3.28: Normalized estimation error for the LMS, SCLMS and TDLMS in the second
experiment (horizontal axis: iterations).

Figure 3.29: The excess MSE for the LMS, SCLMS and TDLMS in the second experiment.

The behavior of the first and the third coefficients of the TDLMS together with the
first and the third coefficients of Th are plotted in Fig. 3.26 and Fig. 3.27, respectively.
We can see that in this case the coefficients of the TDLMS converge near Th. However,
from the learning curves shown in Fig. 3.28 we can see that also in this case the TDLMS
converges to a higher level of the NEE compared to the LMS and SCLMS. This is due
to the fact that during the adaptation the long strings of input samples with constant
values are transformed by the orthogonal transform into vectors having just one non-zero
coefficient. Because of this phenomenon, the coefficients of the adaptive filter are not
updated often enough, resulting in a higher level of the steady-state NEE.

The excess MSEs obtained with the LMS, SCLMS and TDLMS are plotted in Fig. 3.29.
We can see that the TDLMS and SCLMS converge faster than the LMS for a correlated
input sequence. This faster convergence was expected, since both algorithms
decorrelate the input sequence, which increases the convergence speed.


Here we have analyzed the problem of local echo cancellation for digital transmission
over a telephone line. For this practical application we have studied the behavior of three
different adaptive algorithms: the Least Mean Squared (LMS) algorithm, the Scrambled
LMS (SCLMS) and the Transform Domain LMS (TDLMS) algorithm.

We have found that in the case of TDLMS the level of the steady-state NEE is higher
compared to LMS and SCLMS when the input sequence contains long strings of
consecutive equal samples, and it cannot be decreased due to the fact that some
coefficients are not updated often enough. However, all algorithms converge to the
same level of the steady-state MSE, which is of interest in practical applications, when the
step-sizes are appropriately chosen.

For input sequences which contain long strings of equal samples, the convergence speeds
of the TDLMS and SCLMS are comparable and higher than the convergence speed of
the LMS. This result is expected, since it is well known that the LMS converges slower for
correlated inputs than other algorithms which perform the decorrelation at the input of
the adaptive filter.


Chapter 4 
Applications 


In this chapter, we discuss the implementation of various adaptive algorithms introduced
in the previous chapters for three practical applications. First, we address the problem
of channel equalization for Gaussian and non-Gaussian noise environments. Second,
the behavior of different adaptive algorithms for the problem of Code-Division
Multiple-Access using Direct-Sequence (DS) spread spectrum signaling is discussed. Finally,
the problem of echo cancellation for digital data transmission is addressed. Computer
experiments showing the results obtained with the compared algorithms are presented
and the advantages and the disadvantages of the implementations are discussed.


Section 4.1 is dedicated to the problem of channel equalization and the algorithms 
which are implemented for this problem are the order statistics LMS, the plain time 
domain LMS, the variable step-size LMS in time domain and the transform domain im- 
plementations (the plain transform domain LMS and transform domain LMS algorithms 
with variable step-size). All these algorithms were described in the previous sections of 
the thesis. We first briefly describe the problem of channel equalization and the block 
diagrams of the time domain and transform domain implementations are presented. In 
Section 4.1.1 the order statistic LMS algorithms are implemented in channel equalization 
framework for non-Gaussian channel noise whereas in Section 4.1.2 the time domain and 
transform domain algorithms are compared for the situation in which the channel noise 
is Gaussian distributed. 

In Section 4.2 the problem of CDMA multiuser detection is addressed. We first briefly
describe the framework in which the adaptive filters are implemented and then the
simulations and results obtained with the time domain adaptive algorithms with variable
step-size are shown. For this application we have implemented the following adaptive
algorithms: the plain LMS, the variable step LMS proposed in [48], the robust variable step
LMS proposed in [1] and the complementary pair variable step LMS algorithm described
in Section 2.2.

In the last part of this chapter, an extension of the Variable Length LMS for correlated
input sequences is introduced. The proposed algorithm is a combination of the VLLMS
algorithm introduced in Chapter 2 and the SCLMS presented in Chapter 3. The performance
of the new implementation is studied in the echo cancellation framework.


4.1 Channel equalization 


For the problem of channel equalization, we study the performance of two main classes
of adaptive filters, namely the time domain implementations and the transform domain
implementations. The block diagram implemented for channel equalization in the time
domain is depicted in Fig. 4.1, where h represents the time invariant linear communication
channel, v(n) represents the channel noise, ĥ(n) is the vector containing the coefficients
of the adaptive filter, x_in(n) is the sequence transmitted through the channel and x_out(n)
the output sequence from the channel.

The channel noise is added to the output sequence x_out(n) and the result x(n) represents
the input to the adaptive filter. The coefficients of the adaptive filter are updated
such that the MSE at the output of ĥ(n) is minimized. The error is computed as the
difference between the desired sequence d(n) and the output of the adaptive filter, and the
sequence d(n) is obtained as a delayed version of the transmitted signal x_in(n). Actually,
in practice the sequence d(n) is stored in the receiver and during the adaptation period
the same sequence is transmitted through the channel. During the training period, the
sequence x_in(n) and its delayed version d(n) do not contain useful information. After
the training period the coefficients of the adaptive filter are kept constant and the
sequence which contains the useful information is transmitted. This method is known as
training-based channel equalization, since the transmission of the useful information is
interrupted from time to time in order to perform the equalization of the channel during
the training period.
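The training procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the three-tap channel, the noise level, the step-size `mu` and the delay `D` are hypothetical values chosen for the example, and the plain LMS is used as the adaptive algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
channel = np.array([0.2, 1.0, 0.2])               # hypothetical channel, not the thesis model
N, D, mu = 11, 6, 0.02                            # equalizer length, delay, step-size (assumed)
n_train = 5000

x_in = rng.choice([-1.0, 1.0], size=n_train)      # known bipolar training sequence
x_out = np.convolve(x_in, channel)[:n_train]      # channel output
x = x_out + 0.01 * rng.standard_normal(n_train)   # equalizer input = output + channel noise

w = np.zeros(N)                                   # equalizer coefficients
sq_err = np.zeros(n_train)
for n in range(N - 1, n_train):
    u = x[n - N + 1:n + 1][::-1]                  # regressor [x(n), x(n-1), ..., x(n-N+1)]
    d = x_in[n - D]                               # stored, delayed training symbol d(n)
    e = d - w @ u                                 # output error
    w += mu * e * u                               # LMS update
    sq_err[n] = e ** 2

final_mse = sq_err[-500:].mean()                  # squared error at the end of training
```

After the training period the vector `w` would be frozen and used to filter the sequence that carries the useful information.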

For transform domain implementations the same scheme of training-based channel
equalization is implemented, as shown in Fig. 4.2. The difference between the two figures
consists in the block denoted as T which appears in Fig. 4.2. This block represents
the orthogonal transformation applied at the input of the adaptive filter, such that the
output from the channel plus the channel noise, x(n) = x_out(n) + v(n), is first transformed
to s(n) = Tx(n) and then applied to the adaptive filter.


In all simulations presented here the transmitted sequence is bipolar with values
randomly chosen from {−1, +1}. Although the transmitted signal x_in(n) represents a random
sequence, the samples x_out(n) at the output of the channel are correlated due to the
channel coefficients. Due to this fact, the eigenvalue spread of the autocorrelation matrix
of the sequence x(n) can be in some cases very high and the convergence speed of the time
domain implementations is low. This is the reason why we have also implemented the
transform domain adaptive filters for this application.

Figure 4.2: The block diagram for channel equalization in transform domain.


Another situation which can occur in practice is when the channel noise v(n) has a
non-Gaussian distribution. For instance, when the noise distribution is impulsive, the
LMS has stability problems. This is due to the fact that the impulses contained in
v(n) influence the adaptation of the coefficients through the term x(n)e(n).¹ For such
situations the Order Statistics LMS algorithms might be an alternative solution.

Computer experiments showing the performance of the OSLMS algorithms described
in Chapter 2 are presented in the next section for various noise distributions. For the
case of Gaussian noise distribution, the results obtained with the variable
step-size LMS algorithms in time and transform domain are presented in Section 4.1.2.

¹ We note that in this application, the noise appears just at the input of the adaptive filter. However,
due to the fact that the desired sequence d(n) is bipolar, a certain degree of impulsivity exists at the
output of the adaptive filter as well.


4.1.1 Channel equalization in non-Gaussian noise environments 


The block diagram used in the experiments is depicted in Fig. 4.1 and the compared
algorithms were the order statistic LMS algorithms from Section 2.5. A delayed version of
the transmitted sequence x_in(n), with an appropriate delay D, is used as the desired signal
d(n) = x_in(n − D) for the adaptive filters. The channel had 11 coefficients, the
lengths of all the compared adaptive filters were N = 11 and the length of the weighting
vector (a and a(n)) for the gradient was L = 7 (see Section 2.5).

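The channel noise v(n) has a generalized exponential density given by:

$$ p(r) = k_1 e^{-k_2 |r|^{\beta}}, \quad |r| < \infty, \; 0 < \beta < \infty, \qquad (4.1) $$

where k_1 and k_2 are given by:

$$ k_1 = \frac{\beta k_2^{1/\beta}}{2 \Gamma(1/\beta)}, \qquad k_2 = \left[ \frac{\Gamma(3/\beta)}{\sigma^2 \Gamma(1/\beta)} \right]^{\beta/2} \qquad (4.2) $$

with Γ being the ordinary gamma function and σ the standard deviation.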
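Noise with such a density can be generated from gamma variates. The following is a minimal sketch, assuming the standard generalized exponential form p(r) = k_1 exp(−k_2 |r|^β) with variance σ²; the function names are hypothetical. It relies on the fact that if G is gamma distributed with shape 1/β and unit scale, then (G / k_2)^(1/β) with a random sign has the density above.

```python
import numpy as np
from math import gamma, sqrt, pi

def gen_exp_constants(beta, sigma2):
    """Constants k1 and k2 of the generalized exponential density with variance sigma2."""
    k2 = (gamma(3.0 / beta) / (sigma2 * gamma(1.0 / beta))) ** (beta / 2.0)
    k1 = beta * k2 ** (1.0 / beta) / (2.0 * gamma(1.0 / beta))
    return k1, k2

def sample_gen_exp(beta, sigma2, size, rng):
    """Draw samples via the transformation of a gamma variate."""
    _, k2 = gen_exp_constants(beta, sigma2)
    g = rng.gamma(shape=1.0 / beta, scale=1.0, size=size)
    magnitude = (g / k2) ** (1.0 / beta)
    sign = rng.choice([-1.0, 1.0], size=size)
    return sign * magnitude

rng = np.random.default_rng(4)
v_gauss = sample_gen_exp(beta=2.0, sigma2=1.0, size=200_000, rng=rng)      # Gaussian case
v_impulsive = sample_gen_exp(beta=0.5, sigma2=1.0, size=200_000, rng=rng)  # heavy-tailed case
k1_gauss, k2_gauss = gen_exp_constants(2.0, 1.0)
```

For β = 2 the constants reduce to those of the Gaussian density and for β = 1 to those of the Laplacian density, which gives a quick sanity check of the two formulas.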

In order to have a fair comparison, the step-sizes of all the algorithms were chosen
to give comparable convergence speeds and the step-size for the Z-LMS algorithm was
chosen to be μ = 0.1, which satisfies the stability condition from Section 2.5.

As β in (4.1) increases from a value close to zero, the resulting density varies from
highly impulsive to Gaussian and to uniform. However, the gradient has a certain degree
of impulsivity also in the case of Gaussian and uniform channel noise due to the desired
signal d(n), which is bipolar. Therefore, we expect that the Outer Mean LMS does
not give satisfactory results for any considered noise distribution. Moreover, the
gradient distribution is also influenced by the distribution of the channel noise and of the
input sequence x(n), and the performance of the Median LMS is also expected to be poor.
With these observations we can expect that, among the OSLMS algorithms with fixed
weighting coefficients, the Trimmed Mean LMS would have the best performance. Note that an
algorithm similar to the proposed AOSLMS, in which not only the trimming coefficient is
adapted but also the envelope of the weighting coefficients, would give even better results.

Figure 4.3: Steady-state mean squared error for median LMS, MxLMS, OxLMS and
AOSLMS for SNR = 0 dB.


The simulations showing the performance of the AOSLMS filter compared with other
OSLMS algorithms are given in Fig. 4.3 and Fig. 4.4 for different noise distributions
and different signal-to-noise ratios at the input of the adaptive filters. In these figures, the
steady-state MSE for each of the compared algorithms is plotted. The results shown in Fig.
4.3 were obtained for a signal-to-noise ratio of SNR = 0 dB at the output of the channel,
whereas in Fig. 4.4 the signal-to-noise ratio was SNR = 10 dB. From these figures, we
can see that the algorithm which uses an adaptive filter to smooth the gradient gives
better results for almost all considered noise distributions, and this is in agreement with
the theoretical considerations from Section 2.5.


Here, we have applied a new AOSLMS adaptive filter to the problem of channel
equalization for non-Gaussian noise environments. The approach of channel equalization
differs from that of system identification, in which the impulsive nature of the gradient is
mainly given by the noise present in the system. Usually, in the case of channel
equalization it is difficult to predict the distribution of the gradient and hence the optimal
weighting vector to smooth the gradient. In such cases, the proposed AOSLMS algorithm
would give better results due to its ability to adapt the weighting coefficients to the
unknown gradient distribution.

Figure 4.4: Steady-state mean squared error for median LMS, MxLMS, OxLMS and
AOSLMS for SNR = 10 dB.


4.1.2 Channel equalization using variable step-size adaptive algorithms


The algorithms compared here, in the channel equalization framework, are the time domain
and transform domain algorithms with fixed and variable step-sizes that were described in
Chapter 2 and Chapter 3. The block diagram for the transform domain implementations
is depicted in Fig. 4.2 and for the time domain implementations in Fig. 4.1. The compared
algorithms were: the plain LMS, the Variable Step-Size LMS (VSSLMS) proposed in [48], the
robust variable step LMS (RVSLMS) from [1], the plain TDLMS using the DCT transform, the
DCT-LMS using the modified power estimator proposed in [45] and the TDVSLMS from
Section 3.2 of this thesis.


The transform used at the input of all transform domain implementations was the
DCT. The signal to noise ratio at the output of the channel was SNR = 30 dB and the
parameters of all the compared algorithms were chosen such that they have comparable
steady-state MSE. In order to set up the parameters of the implemented algorithms, we
have followed the guidelines presented in Section 3.2.

Algorithm | LMS    | VSSLMS | RVSLMS | TDLMS  | DCT-LMS | TDVSLMS
MSE       | 0.0155 | 0.0155 | 0.0157 | 0.0153 | 0.0161  | 0.0153

Table 4.1: The steady-state mean squared error for the compared algorithms in the
channel equalization framework.

Figure 4.5: Mean squared error of the LMS, VSSLMS, RVSLMS and TDVSLMS
implemented for channel equalization.


The plotted learning curves were obtained by averaging the squared errors of 200 


independent runs each run containing a number of 15 x 10* iterations. The steady-state 


mean squared errors obtained experimentally are given in Table 4.1 and these values were 


computed averaging the last 1000 values from the corresponding MSE’s. 


All the adaptive filters have the same number of coefficients N = 17 and the
transmission channel has three coefficients given by the following model (see [45]):

$$ h_i = \begin{cases} \frac{1}{2} \left[ 1 + \cos\left( \frac{2\pi}{W} (i - 2) \right) \right], & 1 \le i \le 3 \\ 0, & \text{otherwise} \end{cases} \qquad (4.3) $$

Figure 4.6: Mean squared error of the TDLMS, DCT-LMS and TDVSLMS implemented
for channel equalization.

In Fig. 4.5, the learning curves obtained for the LMS, VSSLMS, RVSLMS and
TDVSLMS algorithms are depicted. In order to have a clearer representation, just the
first 4000 samples of each learning curve are plotted. We can see from this figure that the
TDVSLMS clearly has a higher convergence speed than the time domain implementations.
This is expected, since the input signal into the adaptive filter was highly correlated due
to the coefficient W = 3.75 in (4.3).
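The effect of W on the correlation of the equalizer input can be quantified through the eigenvalue spread of the input autocorrelation matrix. The sketch below assumes the raised-cosine form of the three channel taps, a white unit-power bipolar transmitted sequence and white channel noise at the stated SNR; the function names and the comparison value W = 2.9 are hypothetical choices for this illustration.

```python
import numpy as np

def raised_cosine_channel(W):
    """Three channel taps h_1, h_2, h_3 of the raised-cosine model."""
    i = np.arange(1, 4)
    return 0.5 * (1.0 + np.cos(2.0 * np.pi * (i - 2) / W))

def eigenvalue_spread(h, n_taps, snr_db):
    """Eigenvalue spread of the n_taps x n_taps autocorrelation matrix of the
    equalizer input, for a white unit-power sequence sent through h."""
    r = np.correlate(h, h, mode="full")[len(h) - 1:]    # autocorrelation at lags 0, 1, 2
    r[0] += r[0] * 10.0 ** (-snr_db / 10.0)             # white channel noise adds to lag 0
    col = np.zeros(n_taps)
    col[:len(r)] = r
    lags = np.abs(np.subtract.outer(np.arange(n_taps), np.arange(n_taps)))
    R = col[lags]                                       # symmetric Toeplitz matrix
    eig = np.linalg.eigvalsh(R)
    return eig.max() / eig.min()

h = raised_cosine_channel(3.75)
spread_mild = eigenvalue_spread(raised_cosine_channel(2.9), n_taps=17, snr_db=30.0)
spread_thesis = eigenvalue_spread(h, n_taps=17, snr_db=30.0)
```

A larger W pushes the outer taps towards 0.5 and deepens the spectral null of the channel, so the spread for W = 3.75 is much larger than for a milder channel, which is the condition under which the time domain LMS is slow.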

A more interesting result is presented in Fig. 4.6, where the TDVSLMS algorithm is 
compared with the plain TDLMS and the DCT-LMS using the modified power estimator. 
We can see from this figure that the TDVSLMS is the fastest algorithm among these three 
transform domain implementations. 

The transform domain variable step-size LMS algorithm described in Section 3.2,
applied to the problem of channel equalization, shows better convergence speed compared
to other well known time domain and transform domain algorithms. We have seen
in Chapter 3 that the computational complexity of the TDVSLMS algorithm is
comparable with that of the plain TDLMS, which makes it a very good candidate for practical
implementations. Another transform domain algorithm which can be implemented in
this framework is the TDCPVSLMS from Section 3.2 of this thesis. Although it is
computationally more expensive, the setup of its parameters is simpler.


4.2 CDMA multiuser detection 


Code Division Multiple Access (CDMA) using direct sequence (DS) spread-spectrum
signaling has been implemented with success in telecommunication applications. Some
of the main advantages of the DS/CDMA technique are: the ability of asynchronous
operation, a better channel usage compared with other techniques that allow only a single
user to transmit over the channel at a certain time, and its ability to operate in the
presence of narrow band communication systems. When a given user is demodulated in a
DS/CDMA system, two types of interference must be minimized, namely the wide band
Multiple Access Interference (MAI) and the Narrow Band Interference (NBI), as well as
the channel noise. The MAI is caused by other spread spectrum users in the channel,
while the NBI is caused by other conventional communication systems.

Among other demodulation techniques, the adaptive methods have been successfully
applied to reduce both the MAI and the NBI in DS/CDMA systems. When the
spreading code and the channel parameters of the desired user are known or can be
estimated, the blind adaptive detectors can be easily used [36], [37], [42], [47], [67], [68], [81],
[75], whereas in the absence of this information the training-based implementations
are preferred [50], [53], [57].

In the training-based systems a known training sequence is transmitted, which is used
to tune the coefficients of the adaptive filter before the actual data is sent. The well
known adaptive algorithm used in both blind and training-based demodulators is the
LMS algorithm, which has the advantage of having a simple implementation and low
computational complexity. However, the main disadvantages of the LMS algorithm are
its slow convergence when operating with highly correlated input signals and the trade-off
between the convergence speed and the output error, as we have pointed out throughout
this thesis [41]. In order to reduce these disadvantages many of its variants were introduced
in the open literature, such as the class of Variable Step-Size LMS algorithms.

In this section, we analyze the behavior of different VSSLMS adaptive algorithms
for the problem of multiuser detection in a synchronous CDMA system. We show, by
means of simulations, that the Complementary Pair Variable Step-Size LMS (CP-VSLMS)
adaptive algorithm introduced in Section 2.2 possesses a faster convergence speed than other
known algorithms, while reducing the trade-off between convergence speed and
steady-state output error.

Figure 4.7: Block diagram of an adaptive detector using the LMS algorithm.


4.2.1 Problem formulation and theoretical background 


For the sake of simplicity, we consider a synchronous CDMA system in which a number
of K users transmit over a single-path time-invariant channel. The processing gain is
denoted by N, the attenuation of the data of the k-th user is denoted by a_k and the data
symbols transmitted by all users are aligned in time.

The received signal, sampled at chip rate, can be written in vector form as follows:

$$ r(n) = S A d(n) + v(n), \qquad (4.4) $$

where the j-th column of S represents the received spreading code of the j-th user, the vector
d(n) = [d_1(n), ..., d_K(n)]^t contains the data symbols transmitted by all users at the time
instant n, the N × 1 vector v(n) is the sampled channel noise and the K × K matrix A is
given by:

$$ A = \begin{bmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_K \end{bmatrix} $$

Assuming that the desired user is user 1, a block diagram of a training-based detector
using the standard LMS adaptive algorithm is depicted in Fig. 4.7, where r(n) is the
input vector described in (4.4), ĥ(n) = [ĥ_1(n), ..., ĥ_N(n)]^t is the N × 1 vector containing
the coefficients of the demodulator, d_1(n) is the known desired sequence, which is the same
as the data sequence transmitted by user 1, and e(n) is the output error.

The LMS adaptive algorithm used to train the coefficients of the adaptive filter ĥ(n)
can be described by the following steps (see also Section 2.2 of this thesis):

1. Compute the output of the adaptive filter ĥ(n):

$$ \hat{y}(n) = \sum_{i=1}^{N} \hat{h}_i(n) r_i(n), \qquad (4.5) $$

where r_i(n) is the i-th element of the vector r(n) in (4.4).

2. Compute the output error:

$$ e(n) = d_1(n) - \hat{y}(n), \qquad (4.6) $$

3. Update the coefficients of the adaptive demodulator:

$$ \hat{h}(n+1) = \hat{h}(n) + \mu e(n) r(n), \qquad (4.7) $$

where μ is a constant parameter called step-size, which controls the steady-state
error and the convergence speed.
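The three steps above can be sketched as a single function. This is a minimal illustration: the function name `lms_step`, the target vector `c`, the Gaussian regressors and the step-size value are hypothetical choices used only to exercise the update in a noise-free setting.

```python
import numpy as np

def lms_step(h, r, d, mu):
    """One LMS iteration following steps 1 to 3."""
    y = h @ r                    # step 1: filter output, as in (4.5)
    e = d - y                    # step 2: output error, as in (4.6)
    h_next = h + mu * e * r      # step 3: coefficient update, as in (4.7)
    return h_next, e

rng = np.random.default_rng(5)
N, mu = 4, 0.05                  # hypothetical filter length and step-size
c = rng.standard_normal(N)       # hypothetical vector the filter should learn
h = np.zeros(N)
for _ in range(2000):
    r = rng.standard_normal(N)   # white input vector
    d = c @ r                    # noise-free desired sample
    h, e = lms_step(h, r, d, mu)
```

With a white input the coefficients converge quickly; the slow convergence discussed in the text appears when the entries of r(n) are strongly correlated.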


For the training-based detector the convergence speed is governed by the eigenvalue
spread of the input autocorrelation matrix, which is defined as follows:

$$ R = E[r(n) r^t(n)] = E[S A d(n) d^t(n) A^t S^t] + E[v(n) v^t(n)], \qquad (4.8) $$

where we have assumed that the elements of the vector v(n) are random, zero-mean and
independent from S, A and d(n).

It is clear from (4.8) that the eigenvalue spread of the input autocorrelation matrix R 
can be far from unity and an adaptive demodulator using the standard LMS algorithm 
will have a very slow convergence. Since in the case of training based detectors, during the 
adaptation period no data sequences can be transmitted, a slow convergence will decrease 
also the transmission rate. Therefore, in practical applications, the convergence speed of 
the detector has to be increased while maintaining a small steady-state error. 
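The size of the eigenvalue spread can be illustrated with a small numerical example. The sketch assumes equiprobable, mutually independent bipolar data symbols, so that E[d(n)d^t(n)] is the identity matrix and R reduces to S A A^t S^t + σ² I. The random ±1 spreading codes (instead of Gold sequences), the attenuations and the noise variance `sigma2` are hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(6)
N, K = 31, 4                                         # processing gain and number of users
S = rng.choice([-1.0, 1.0], size=(N, K)) / np.sqrt(N)  # hypothetical unit-norm spreading codes
a = np.array([10.0 ** (-10.0 / 20.0), 1.0, 1.0, 1.0])  # user 1 is 10 dB below the others
A = np.diag(a)
sigma2 = 1e-2                                        # hypothetical noise variance

# For i.i.d. bipolar data E[d d^t] = I, hence R = S A A^t S^t + sigma2 * I
R = S @ A @ A.T @ S.T + sigma2 * np.eye(N)
eig = np.linalg.eigvalsh(R)
spread = eig.max() / eig.min()
```

Because only K of the N eigenvalues are raised above the noise floor, the ratio between the largest and the smallest eigenvalue grows as the noise variance decreases, which is why the plain LMS converges slowly in this setting.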

Besides a slow convergence, the plain LMS algorithm also has the disadvantage of a
trade-off between speed and steady-state output error. Indeed, from (4.7) we can see that
in order to obtain a small steady-state error, one has to choose a small step-size, but a
small value of μ decreases the speed of convergence of the algorithm.

In the sequel we study, by means of computer experiments, the behavior of the adaptive
algorithms described in Section 2.2 of this thesis.

Table 4.2: Steady-state MSE and the parameters for the compared algorithms for CDMA
multiuser detection.
Algorithm Parameters Steady-State MSE (dB) 
LMS (iti SIO" -17.4256 
LMS jee el -14.0252 
ingao X10". fe S307 2, 
CP-VSLMS -17.4251 
O=10.9; = 0) i 
ina o S10 ss. fie 3 elle, 
VSSLMS -17.2241 
a = 0.9, y = 0.002 f 
inns 310) Pee 3 10", 
RVSSLMS -17.264 
a = 0.97, 6 = 0.99 SD eee 


4.2.2 Simulations and results 


We compare the performance, in terms of convergence speed and steady-state MSE, of
the CP-VSLMS, plain LMS, VSSLMS and RVSSLMS algorithms in the CDMA multiuser
detection framework. The signal model is given in (4.4), and the number of users was
K = 4, with the first user being the user of interest. The attenuation of the first user was
10 dB below the attenuation of the other three users. The spreading codes were chosen
from a set of Gold sequences of length N = 31, and the channel noise v(n) was white
Gaussian with zero mean and variance σ² = 10^-[illegible]. The transmitted data (the
elements of the vector d(n) in (4.4)) were equiprobable bipolar sequences with values in {-1, +1}.
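
The simulation setup above can be sketched as follows. This is a simplified illustration under stated assumptions: random bipolar codes stand in for the Gold sequences, the model is taken to be r(n) = S A d(n) + v(n) with one amplitude per user, and the function names and the numerical values in the example are hypothetical.

```python
import random


def make_codes(n_chips, n_users, rng):
    """Random bipolar spreading codes (a stand-in for Gold sequences)."""
    return [[rng.choice((-1.0, 1.0)) for _ in range(n_chips)]
            for _ in range(n_users)]


def received_vector(codes, amplitudes, bits, noise_std, rng):
    """Form r(n) as the sum of amplitude * bit * code over users, plus noise."""
    n_chips = len(codes[0])
    # Start from the white Gaussian noise vector v(n).
    r = [rng.gauss(0.0, noise_std) for _ in range(n_chips)]
    for code, a, b in zip(codes, amplitudes, bits):
        for i in range(n_chips):
            r[i] += a * b * code[i]
    return r


rng = random.Random(0)
codes = make_codes(31, 4, rng)                 # N = 31 chips, K = 4 users
# Assumed reading of the setup: user 1 is 10 dB weaker than the others.
amplitudes = [10 ** (-10 / 20), 1.0, 1.0, 1.0]
bits = [rng.choice((-1, 1)) for _ in range(4)]
r = received_vector(codes, amplitudes, bits, noise_std=0.1, rng=rng)
```

The vector `r` plays the role of the filter input in (4.5), and the bit of the first user is the desired sample during training.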

The parameters of all the tested algorithms are presented in Table 4.2 together with
the corresponding values of their steady-state MSE. These parameters were chosen to
give comparable steady-state MSE for all adaptive filters. The one exception is the LMS
implemented with the fixed step-size μ_max = 3 × 10^-[illegible], which was included for
benchmark purposes. The learning curves (the output MSE during the adaptation) for all
algorithms are shown in Fig. 4.8, Fig. 4.9, Fig. 4.10, Fig. 4.11 and Fig. 4.12. These results
were obtained by averaging 100 runs, each 4 × 10^4 iterations long. From these figures
we can see that the CP-VSLMS converges faster than the VSSLMS, the RVSSLMS and the
LMS with the small step-size, while their steady-state MSE values are comparable.
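
The ensemble averaging used for the learning curves can be sketched as below. The helper name `averaged_learning_curve_db` and the data layout (one list of output errors per run) are assumptions made for this illustration; the curve is expressed in dB, as in Table 4.2.

```python
import math


def averaged_learning_curve_db(error_runs):
    """Average e(n)^2 over independent runs and convert to dB.

    error_runs[k][n] is the output error e(n) recorded in run k.
    """
    n_runs = len(error_runs)
    n_iter = len(error_runs[0])
    curve = []
    for n in range(n_iter):
        # Ensemble estimate of the MSE at iteration n.
        mse = sum(run[n] ** 2 for run in error_runs) / n_runs
        curve.append(10.0 * math.log10(mse) if mse > 0.0 else float("-inf"))
    return curve


curve = averaged_learning_curve_db([[1.0, 0.1], [1.0, 0.1]])
```

A steady-state value such as those in Table 4.2 would then be obtained by additionally averaging the tail of the curve over time.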

To give a clearer insight into the behavior of the compared algorithms, the expected
value of the step-size during the adaptation is plotted in Fig. 4.13, Fig. 4.14 and
Fig. 4.15 for the CP-VSLMS, VSSLMS and RVSSLMS, respectively. From these figures we can
see that the step-size of the CP-VSLMS algorithm has the smallest variations in the
steady state and that its value is very close to μ_min = 3 × 10^-4. These results are
in agreement with our theoretical considerations from Section 2.2 that the steady-state
misadjustment of the CP-VSLMS is given by μ_min.

Figure 4.8: Output mean squared error for the LMS with μ_min = 3 × 10^-4.

Figure 4.9: Output mean squared error for the LMS with μ_max = 3 × 10^-[illegible].

Figure 4.10: Output mean squared error for CP-VSLMS.

Figure 4.11: Output mean squared error for VSSLMS.

In Chapter 2, the computational complexity and memory load² of the variable step-size
algorithms were compared, and we have seen that the computational complexity and
memory load of the CP-VSLMS algorithm are almost double those of the other
algorithms. However, the benefit of the proposed algorithm is the increased convergence
speed and the fact that the dependence between the speed of convergence and the
steady-state error is eliminated. Indeed, the steady-state misadjustment of the CP-VSLMS is
given by its steady-state step-size, which is μ_i(∞) = μ_min, whereas the speed of
convergence can be tuned by selecting the other parameters, such as Q, μ_max and T. In the
case of VSSLMS and RVSSLMS, the equations that give the values of the parameters, provided
in [48] and [1], are sometimes difficult to use because they depend on the minimum MSE.

² The number of memory locations necessary to store the variables and the parameters.

Figure 4.12: Output mean squared error for RVSSLMS.

Figure 4.13: Step-size behavior for CP-VSLMS.

Figure 4.14: Step-size behavior for VSSLMS.

Figure 4.15: Step-size behavior for RVSSLMS.

Figure 4.16: Block diagram of the local echo cancellation using the Scrambled LMS with
adaptive length.


4.3 Scrambled LMS with adaptive length for echo cancellation


In this section, we address the problem of local echo cancellation for digital data
transmission over a telephone line. Moreover, the transmission is secured by the use of a
scrambling device at both users, as depicted in Fig. 3.16. In our discussion we assume that
the length of the echo path is unknown, and the adaptive filter implemented is a
combination of the scrambled LMS described in the previous chapter and the variable length
introduced in Section 2.3. The aim of this implementation is to reduce the bias which
appears in the steady-state MSE when the length of the echo path and the length of the
adaptive filter are not equal.

A detailed block diagram of the system implemented in our simulations is depicted in
Fig. 4.16, where we have used three adaptive filters with different lengths, as in the
approach of Section 2.3. The transmitted digital data x(n) has samples chosen from {-1, +1}
and contains long strings of consecutive -1's and +1's. The sequence x(n) is passed through
a scrambling device, which generates the scrambled sequence x_sc(n). As we have seen in
the previous chapter, the scrambling device decorrelates the transmitted data; therefore,
the analytical results for length adaptation from Chapter 2 can be used here³.
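
The decorrelating effect of scrambling can be illustrated with a toy example. The scrambler below is a hypothetical stand-in, not the device described in the previous chapter: it multiplies the bipolar data by a pseudo-random ±1 key sequence, which is the bipolar equivalent of an XOR with a key stream. The function names and the seed are assumptions.

```python
import random


def scramble(x, seed=1):
    """Multiply bipolar data by a pseudo-random +/-1 key (hypothetical scrambler).

    Applying the function twice with the same seed recovers the input,
    because every key sample squares to one.
    """
    key_rng = random.Random(seed)  # both ends are assumed to share this seed
    return [xi * key_rng.choice((-1, 1)) for xi in x]


def lag1_correlation(x):
    """Sample estimate of the lag-1 autocorrelation of a bipolar sequence."""
    return sum(a * b for a, b in zip(x, x[1:])) / (len(x) - 1)


# Data with long strings of identical symbols, as in the text.
x = ([1] * 50 + [-1] * 50) * 10
x_scrambled = scramble(x)
print(lag1_correlation(x), lag1_correlation(x_scrambled))
```

The first printed value is close to one, reflecting the long runs in x(n), while the second is close to zero but not exactly zero, which is consistent with the remark that the decorrelation is not perfect.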

The far-end sequence v(n) is obtained by adding a random zero-mean Gaussian-distributed
sequence and a bipolar sequence with samples from {-a, +a}. The Gaussian
component of v(n) simulates the channel noise, whereas the bipolar component of v(n) is
due to the data transmitted from user B to user A (we have assumed that the hybrid
connections of the telephone line are ideal).
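
A minimal sketch of how such a far-end sequence could be generated is given below. The function name `far_end_sequence` and the numerical values in the example are placeholders chosen for illustration.

```python
import random


def far_end_sequence(n_samples, a, noise_std, rng):
    """Bipolar far-end data from {-a, +a} plus zero-mean Gaussian noise."""
    return [rng.choice((-a, a)) + rng.gauss(0.0, noise_std)
            for _ in range(n_samples)]


rng = random.Random(3)
v = far_end_sequence(1000, a=0.1, noise_std=0.01, rng=rng)
```

From the point of view of the echo canceller, the whole sequence v(n) acts as the disturbance added to the echo.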

The transfer function of the echo path used in the simulations was [72]: 


H(z) = me (4.9) 


where p = 0.80025 and the impulse response of the echo path is shown in Fig. 4.17. 

The attenuation of the data transmitted from user B to user A was chosen as a = 0.1,
and the variance of the channel noise was σ² = 10^-[illegible]. The length of the echo
path model in (4.9) is N = 19, and the lengths of the adaptive filters were initialized
with N_1(0) = 7, N_2(0) = 8 and N_3(0) = 9, respectively. A step-size
μ = 10^-[illegible] was used to initialize the step-sizes of all adaptive algorithms.
This value of μ satisfies the stability condition and also ensures equal misadjustments.
A time-varying test interval was used for length adaptation in (2.113), and the parameter
P was chosen as P = 2.

All the results were obtained by averaging 100 independent runs, each containing
10^4 iterations. The same algorithm as the one described in Section 2.3 was implemented
for length adaptation, and the adaptive filter of interest is h_2(n). The average of the
length N_2(n) of the second adaptive filter h_2(n) during the adaptation is shown in
Fig. 4.18. The value of N_2(n) during one run is shown in Fig. 4.19. We can see that the
length of the second adaptive filter converges close to the optimum length, which is
N = 19. However, there is a small difference between the steady-state value of
E{N_2(n)} and N = 19, because the autocorrelation matrix of the scrambled sequence
x_sc(n) is not diagonal and the off-diagonal terms influence the minimum MSE, as
explained in Chapter 2.


³ Actually, the autocorrelation matrix of the scrambled sequence x_sc(n) is not perfectly
diagonal; therefore, a certain deviation from the optimum length is expected to occur.

Figure 4.18: The average of the length N_2(n).

Figure 4.19: The length of the second adaptive filter during one run.


The MSE of the second adaptive filter is depicted in Fig. 4.20, whereas the MSE of an
adaptive filter with constant length N = 19 is shown in Fig. 4.21. The step-sizes used in
the adaptation were chosen to obtain the same misadjustment for both adaptive filters.

Figure 4.20: Output mean squared error of the second adaptive filter.

Figure 4.21: Output mean squared error of an adaptive filter with optimum length.

Figure 4.22: Mean squared error for the case of a constant length smaller than the length
of the echo path.

For comparison purposes, Fig. 4.22 depicts the MSE of an adaptive filter with a fixed
length of 7. The learning curve is obtained in the same framework as the ones shown in
Fig. 4.20 and Fig. 4.21. Clearly, the filter with adaptive length converges to a smaller
steady-state MSE than the filter whose fixed length is smaller than N. Also, due to the
transient time of the length adaptation, the filter h_2(n) converges more slowly than
the filter with optimum length.
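
The larger steady-state MSE of a filter that is too short can be quantified in an idealized case. Assume, for illustration only, that the input of the adaptive filter is perfectly white with variance \sigma_x^2 and uncorrelated with v(n), that h_i are the N coefficients of the echo path, and that \sigma_v^2 denotes the power of v(n). The optimum filter of length M < N then equals the first M coefficients of the echo path, and its minimum MSE is

```latex
J_{\min}(M) = \sigma_v^2 + \sigma_x^2 \sum_{i=M}^{N-1} h_i^2 , \qquad M < N
```

The second term is the bias caused by the unmodeled tail of the echo path, and it vanishes when M reaches N.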

From the results shown in this section, we can conclude that length adaptation is a very
important issue in echo cancellation applications. If the adaptive filter is shorter than
the echo path, the steady-state MSE is larger than in the situation when the length of
the echo path is known. When a scrambling device is used to secure the data transmission,
it also decorrelates the input sequence of the adaptive filter, and an algorithm such as
the one introduced in Chapter 2 can be implemented for length adaptation. However, due to
the imperfect decorrelating properties of the scrambling device, the off-diagonal terms
of the matrix R_sc are non-zero and the estimation of the length is not perfect.

Chapter 5
Conclusions


This thesis has introduced several new algorithms for adaptive filtering, all of them derived
from the well-known Least Mean Squared adaptive algorithm, which is widely used in
many practical applications due to its simplicity. Despite its simplicity, the LMS has
some major drawbacks, which are discussed throughout this work. One goal of
this thesis has been to analyze each of these drawbacks of the LMS and to provide
solutions that improve its performance in terms of convergence speed, adaptation error,
tracking capability and stability in Gaussian and non-Gaussian noise environments.

The newly developed techniques differ in the framework addressed, the selection of
parameters and the application. We have derived two new time-domain adaptive algorithms
with variable step-size, which provide high convergence speed while maintaining a small
steady-state error. The proposed algorithms use the output error in the adaptation of the
step-size, a concept that has been utilized by many researchers. The difference between
the new methods and the existing approaches is that the output error is not directly
included in the expression of the step-size, although the step-size adaptation is still
based on the mean squared error. As a consequence, the analytical expression of the
steady-state misadjustment is simplified¹ and the setup of the parameters is very easy.
Moreover, the parameters of the proposed algorithms are not sensitive to the level of the
signals involved.

It is well known that the transform domain LMS converges faster than the time domain
implementations for highly correlated inputs. We have shown here that, using the same
concept of step-size adaptation in the transform domain, the convergence speed can be
increased even further. As a consequence, a new class of transform domain adaptive
algorithms with variable step-size has been introduced in this thesis.

¹ Actually, it is shown to be the same as for the plain LMS with fixed step-size.

The estimation of the model length in system identification applications might be
of great interest. Moreover, when there is a mismatch between the length of the model and
that of the adaptive filter, the output mean squared error is increased. To address this
problem, we have introduced a variable length LMS algorithm in which not only the
coefficients of the adaptive filter are adapted but also its length.

Tracking capability for identification applications is also addressed in this thesis, and
two new algorithms with adaptive step-size are introduced. In a time-varying environment,
the steady-state mean squared error possesses a minimum for a certain value of
the step-size. In the existing approaches, the optimum step-size is computed based on
some prior information about the statistics of the model. Our proposed algorithms do
not use any such prior information for step-size adaptation. The aim of step-size
adaptation in a time-varying environment is not to increase the convergence speed, but to
adapt the step-size toward the optimum. However, one of the two proposed algorithms is
derived for the transform domain, which also provides an improved convergence speed.
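
The existence of an optimum step-size can be illustrated with the textbook random-walk model, which is an assumption made here and not necessarily the model used in the body of the thesis. The optimum coefficients evolve as h_o(n+1) = h_o(n) + q(n) with E[q(n) q^T(n)] = Q, and for a small step-size \mu the steady-state excess MSE of the LMS is approximately

```latex
J_{ex}(\mu) \approx \frac{\mu}{2} J_{\min} \, \mathrm{tr}(R)
                  + \frac{1}{2\mu} \, \mathrm{tr}(Q),
\qquad
\mu_{opt} \approx \sqrt{\frac{\mathrm{tr}(Q)}{J_{\min} \, \mathrm{tr}(R)}}
```

The first term (gradient noise) grows with \mu and the second term (lag error) decreases with \mu, which produces the minimum. Evaluating \mu_{opt} directly requires Q and J_{\min}, which is the kind of prior information referred to above.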

Non-Gaussian noise environments are known to be difficult for the LMS algorithm because
it uses the instantaneous gradient to update the coefficients of the adaptive filter. As
a result, the LMS may have stability problems for impulsive distributions of the signals.
An answer to this problem is to use a smoothed version of the gradient in the update
formula. The gradient can be smoothed using different nonlinear filters, and the resulting
algorithms are available in the open literature. The nonlinear filter has to be chosen
based on the distribution of the gradient. In our new approach we use a nonlinear filter
with adaptive coefficients for smoothing the gradient, so that prior information about
the gradient distribution is no longer necessary.
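
To illustrate gradient smoothing with a fixed nonlinear filter, the sketch below uses a median over the most recent instantaneous gradients, in the spirit of the median LMS from the literature. It is not the adaptive nonlinear filter proposed in this thesis, and the class name `MedianLMS` and its parameters are assumptions introduced for the example.

```python
from collections import deque
from statistics import median


class MedianLMS:
    """LMS whose update uses the median of the last `window` gradients."""

    def __init__(self, n_taps, mu, window):
        self.h = [0.0] * n_taps
        self.mu = mu
        self.grads = deque(maxlen=window)  # recent instantaneous gradients

    def step(self, r, d):
        y = sum(hi * ri for hi, ri in zip(self.h, r))
        e = d - y
        # Store the instantaneous gradient estimate e(n) r(n).
        self.grads.append([e * ri for ri in r])
        # Smooth each gradient component with a median before updating.
        for i in range(len(self.h)):
            self.h[i] += self.mu * median(g[i] for g in self.grads)
        return y, e


f = MedianLMS(n_taps=1, mu=0.1, window=3)
f.step([1.0], 1.0)
f.step([1.0], 1.0)
f.step([1.0], 1000.0)  # impulsive outlier in the desired signal
```

In this example the outlier is rejected by the median, so the coefficient moves by only `mu * 1.0` in the third step, whereas the plain LMS update (4.7) would move it by roughly `mu * 1000`.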

The scrambled LMS algorithm was primarily introduced for applications where
the data transmission must be secured. However, scrambling was shown to be
a good decorrelation technique, which can increase the convergence speed. The question
that arises is how the scrambled LMS performs in comparison with the transform domain
LMS. To gain insight into this problem, the analytical expressions of the steady-state
mean squared error and mean squared coefficient error for the LMS, TDLMS and scrambled
LMS are derived in this thesis. These expressions are obtained for the special case when
the input sequence has equal samples, in the framework of digital transmission over a
telephone line. The optimum solution of the compared algorithms is also derived, and
simulation results supporting the theoretical considerations are presented.

As a final conclusion, we can state that the contributions of this thesis have both
theoretical and practical importance, introducing several solutions that improve
the behavior of the Least Mean Squared adaptive algorithm. All the proposed algorithms
were developed based on analytical expressions derived for different situations which can
appear in practice. Moreover, this thesis work also suggests many other possible topics
of future research.